diff --git a/README.md b/README.md
index 813e558..1efd801 100644
--- a/README.md
+++ b/README.md
@@ -22,6 +22,8 @@
 LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
   ./llamacpphtmld
 ```
+Use the `GOMAXPROCS` environment variable to control how many threads the llama.cpp engine uses.
+
 ## API usage
 
 The `generate` endpoint will live stream new tokens into an existing conversation until the LLM stops naturally.