doc/README: mention GOMAXPROCS is passed through to llama.cpp

This commit is contained in:
mappu 2023-04-09 11:14:19 +12:00
parent 2f4558b68e
commit 082fb7552a


@@ -22,6 +22,8 @@ LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
./llamacpphtmld
```
Use the `GOMAXPROCS` environment variable to control how many threads the llama.cpp engine uses.
## API usage
The `generate` endpoint live-streams new tokens into an existing conversation until the LLM stops naturally.