doc/README: initial commit
# llamacpphtmld

A web interface and API for the LLaMA large language AI model, based on the [llama.cpp](https://github.com/ggerganov/llama.cpp) runtime.

## Features

- Live streaming responses
- Continuation-based UI, supporting interrupt, modify, and resume
- Configurable maximum number of simultaneous users
- Works with any LLaMA model, including [Vicuna](https://huggingface.co/eachadea/ggml-vicuna-13b-4bit)
- Bundled copy of llama.cpp, no separate compilation required

## Usage

All configuration should be supplied as environment variables:

```
LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
LCH_NET_BIND=:8090 \
LCH_SIMULTANEOUS_REQUESTS=1 \
./llamacpphtmld
```
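Environment-variable configuration like the above is easy to load with the standard library. A minimal Go sketch of how such a daemon could read these settings — the `config` struct, `getenvDefault` helper, and the defaults shown are illustrative assumptions, not llamacpphtmld's actual code:

```go
package main

import (
	"fmt"
	"os"
)

// config mirrors the documented environment variables. The struct name
// and the defaults below are assumptions for illustration only.
type config struct {
	ModelPath            string
	NetBind              string
	SimultaneousRequests string
}

// getenvDefault returns the variable's value, or fallback when it is unset.
func getenvDefault(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

func loadConfig() config {
	return config{
		ModelPath:            os.Getenv("LCH_MODEL_PATH"), // no sensible default; would need validation at startup
		NetBind:              getenvDefault("LCH_NET_BIND", ":8090"),
		SimultaneousRequests: getenvDefault("LCH_SIMULTANEOUS_REQUESTS", "1"),
	}
}

func main() {
	cfg := loadConfig()
	fmt.Println(cfg.NetBind, cfg.SimultaneousRequests)
}
```

With none of the variables set, this sketch falls back to the defaults shown (`:8090`, one simultaneous request); a real daemon would presumably refuse to start without `LCH_MODEL_PATH`.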
## API usage

```
curl -v -d '{"ConversationID": "", "APIKey": "", "Content": "The quick brown fox"}' 'http://localhost:8090/api/v1/generate'
```
## License

MIT