
llamacpphtmld

A web interface and API for the LLaMA large language model, built on the llama.cpp runtime.

Features

  • Live streaming responses
  • Continuation-based UI, supporting interrupt, modify, and resume
  • Configure the maximum number of simultaneous users
  • Works with any LLaMA model including Vicuna
  • Bundled copy of llama.cpp, no separate compilation required

Usage

All configuration is supplied as environment variables:

LCH_MODEL_PATH=/srv/llama/ggml-vicuna-13b-4bit-rev1.bin \
	LCH_NET_BIND=:8090 \
	LCH_SIMULTANEOUS_REQUESTS=1 \
	./llamacpphtmld

API usage

curl -v -X POST -d '{"ConversationID": "", "APIKey": "", "Content": "The quick brown fox"}' 'http://localhost:8090/api/v1/generate'

License

MIT
