From 082fb7552a2a545d6890d927dcde9421c12c3dc0 Mon Sep 17 00:00:00 2001
From: mappu
Date: Sun, 9 Apr 2023 11:14:19 +1200
Subject: [PATCH] doc/README: mention GOMAXPROCS is passed through to llama.cpp

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 813e558..1efd801 100644
--- a/README.md
+++ b/README.md
@@ -22,6 +22,8 @@ LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
 ./llamacpphtmld
 ```
 
+Use the `GOMAXPROCS` environment variable to control how many threads the llama.cpp engine uses.
+
 ## API usage
 
 The `generate` endpoint will live stream new tokens into an existing conversation until the LLM stops naturally.