doc/README: mention GOMAXPROCS is passed through to llama.cpp
parent 2f4558b68e
commit 082fb7552a
@@ -22,6 +22,8 @@ LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
 ./llamacpphtmld
 ```
 
+Use the `GOMAXPROCS` environment variable to control how many threads the llama.cpp engine uses.
+
 ## API usage
 
 The `generate` endpoint will live stream new tokens into an existing conversation until the LLM stops naturally.
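As a sketch of the advice this commit adds, assuming the same quick-start invocation shown in the hunk context above, the llama.cpp thread count could be pinned like this:

```
# Run the daemon with llama.cpp limited to 4 threads via Go's GOMAXPROCS.
GOMAXPROCS=4 \
LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
./llamacpphtmld
```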
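The streaming `generate` endpoint could be exercised with a plain HTTP client; the host, port, and query parameter below are illustrative assumptions only, since the commit does not show the daemon's actual route or fields:

```
# Hypothetical call: URL, port, and parameter name are assumptions for
# illustration; -N disables curl's buffering so tokens stream as they arrive.
curl -N 'http://localhost:8080/generate?conversation=example'
```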