code.ivysaur.me/llamacpphtmld

mappu a7dd9580a5 doc/README: initial commit

2023-04-08 15:30:37 +12:00


llamacpphtmld

A web interface and API for the LLaMA large language model, based on the llama.cpp runtime.

Features

Live streaming responses
Continuation-based UI, supporting interrupt, modify, and resume
Configure the maximum number of simultaneous users
Works with any LLaMA model including Vicuna
Bundled copy of llama.cpp, no separate compilation required

Usage

All configuration should be supplied as environment variables:

LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
	LCH_NET_BIND=:8090 \
	LCH_SIMULTANEOUS_REQUESTS=1 \
	./llamacpphtmld
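
The variables above could be checked before launch with a small wrapper. This is only a sketch: the function name lch_check_env is hypothetical, and treating all three variables as required is an assumption, since this README does not say which ones have defaults.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of llamacpphtmld itself.
# Assumption: all three variables from the usage example must be set.

lch_check_env() {
	missing=""
	for name in LCH_MODEL_PATH LCH_NET_BIND LCH_SIMULTANEOUS_REQUESTS; do
		# Indirectly read the variable whose name is stored in $name.
		eval "value=\${$name:-}"
		[ -n "$value" ] || missing="$missing $name"
	done
	if [ -n "$missing" ]; then
		echo "missing environment variables:$missing" >&2
		return 1
	fi
	return 0
}
```

Running `lch_check_env && ./llamacpphtmld` would then start the server only when every variable is set.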

API usage

curl -v -X POST -d '{"ConversationID": "", "APIKey": "", "Content": "The quick brown fox"}' 'http://localhost:8090/api/v1/generate'
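
For scripted use, the request above can be wrapped in a few shell functions. This is a minimal sketch: LCH_URL and the function names are hypothetical, the JSON field names are copied from the example request, and the response is assumed to be streamed back as plain text, which this README does not specify.

```shell
#!/bin/sh
# Sketch of a small client for the generate endpoint.
# LCH_URL, json_escape, lch_payload and lch_generate are illustrative names.

LCH_URL="${LCH_URL:-http://localhost:8090/api/v1/generate}"

# Escape backslashes and double quotes so the prompt fits in a JSON string.
# Newlines and control characters are not handled; keep prompts single-line.
json_escape() {
	printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body with the field names from the example request.
lch_payload() {
	printf '{"ConversationID": "", "APIKey": "", "Content": "%s"}' "$(json_escape "$1")"
}

# POST the prompt. -N turns off curl's output buffering, so text is printed
# as it arrives (assuming the server streams the response body).
lch_generate() {
	curl -sS -N -X POST -d "$(lch_payload "$1")" "$LCH_URL"
}
```

After sourcing the file, `lch_generate "The quick brown fox"` sends the same request as the curl command above.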

License

MIT