doc/README: mention GOMAXPROCS is passed through to llama.cpp
This commit is contained in: parent 2f4558b68e, commit 082fb7552a
@@ -22,6 +22,8 @@ LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
./llamacpphtmld
```
Use the `GOMAXPROCS` environment variable to control how many threads the llama.cpp engine uses.
## API usage
The `generate` endpoint live-streams new tokens into an existing conversation until the LLM stops naturally.