doc/README: update features + api docs
commit bb60bb989f
parent 3ff357b7d4
README.md (11 lines changed)
@@ -5,7 +5,8 @@ A web interface and API for the LLaMA large language AI model, based on the [lla
 ## Features
 
 - Live streaming responses
-- Continuation-based UI, supporting interrupt, modify, and resume
+- Continuation-based UI
+- Supports interrupt, modify, and resume
 - Configure the maximum number of simultaneous users
 - Works with any LLaMA model including [Vicuna](https://huggingface.co/eachadea/ggml-vicuna-13b-4bit)
 - Bundled copy of llama.cpp, no separate compilation required
@@ -23,9 +24,11 @@ LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
 
 ## API usage
 
-```
-curl -v -d '{"ConversationID": "", "APIKey": "", "Content": "The quick brown fox"}' -X 'http://localhost:8090/api/v1/generate'
-```
+The `generate` endpoint will live stream new tokens into an existing conversation until the LLM stops naturally.
+
+- Usage: `curl -v -X POST -d '{"Content": "The quick brown fox"}' 'http://localhost:8090/api/v1/generate'`
+- You can optionally supply `ConversationID` and `APIKey` string parameters. However, these are not currently used by the server.
+- You can optionally supply a `MaxTokens` integer parameter, to cap the number of generated tokens from the LLM.
 
 ## License
 
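For illustration, the pieces documented in the new README text combine into the sketch below. It assumes only what the diff states: the endpoint at `http://localhost:8090/api/v1/generate`, a JSON body with a `Content` field, and the optional `MaxTokens` cap (`ConversationID` and `APIKey` are accepted but currently ignored by the server). The `-N` flag is standard curl behavior that disables output buffering, so streamed tokens print as they arrive.

```
# Sketch: stream a completion, capped at 32 generated tokens.
# -N disables curl's output buffering so each token appears as it streams in.
curl -N -X POST \
  -d '{"Content": "The quick brown fox", "MaxTokens": 32}' \
  'http://localhost:8090/api/v1/generate'
```

Omitting `MaxTokens` should let generation run until the LLM stops naturally, per the endpoint description above.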