Full-stack web UI for LLMs (ChatGPT, LLaMA, etc.)
Top 93.3% on sourcepulse
This project provides a full-stack web UI for interacting with large language models (LLMs) like ChatGPT and local models such as LLaMA. It targets developers and power users looking for a customizable, extensible chat interface with features like web browsing, persistent memory via vector embeddings, and auto-summarization.
How It Works
The backend is built with Python's FastAPI, giving it asynchronous request handling and strong performance. The frontend is developed in Flutter, providing a rich, cross-platform UI. The two communicate in real time over WebSockets. Key features include integration with OpenAI's API, support for local LLMs (llama.cpp, Exllama), vector storage in Redis for conversational memory, and web browsing via DuckDuckGo.
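As an illustration of that request flow, here is a minimal sketch of a FastAPI WebSocket chat loop that streams tokens back to the client. The /ws/chat route, plain-text message format, and generate_reply stub are assumptions for the example, not the project's actual protocol.

```python
# Minimal sketch of the WebSocket chat loop, assuming a /ws/chat
# route and plain-text messages; the real project defines its own
# protocol, so treat the names here as illustrative only.
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()


async def generate_reply(prompt: str):
    # Stand-in for an OpenAI or llama.cpp/Exllama call. Yielding
    # tokens lets the frontend render the reply incrementally.
    for token in ("This ", "is ", "a ", "streamed ", "reply."):
        yield token


@app.websocket("/ws/chat")
async def chat(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            prompt = await websocket.receive_text()
            async for token in generate_reply(prompt):
                # Push tokens as they are produced instead of
                # waiting for the full completion.
                await websocket.send_text(token)
    except WebSocketDisconnect:
        pass  # client closed the tab or the app disconnected
```

Served with uvicorn, any WebSocket client (such as the Flutter frontend) can hold a persistent connection and render the reply token by token.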
Quick Start & Requirements
Clone the repository with its submodules:
git clone --recurse-submodules https://github.com/c0sogi/llmchat.git
cd LLMChat
docker-compose -f docker-compose-local.yaml up
Once the containers are up, the API docs are served at http://localhost:8000/docs and the chat interface at http://localhost:8000/chat. Local model weights go in the llama_models directory; Exllama additionally requires an NVIDIA CUDA GPU.
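To verify the deployment, a quick connectivity check over WebSocket might look like the following; the /ws/chat path is an assumption carried over from the sketch above, so consult http://localhost:8000/docs for the real routes.

```python
# Hypothetical smoke test against the local stack; the /ws/chat
# path is an assumed placeholder, not a documented route.
import asyncio

import websockets  # pip install websockets


async def main():
    async with websockets.connect("ws://localhost:8000/ws/chat") as ws:
        await ws.send("Hello, is anyone there?")
        print(await ws.recv())  # first streamed chunk of the reply


asyncio.run(main())
```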
Highlighted Details
Maintenance & Community
The project is actively maintained by c0sogi. Links to community resources such as Discord or Slack are not provided in the README.
Licensing & Compatibility
Limitations & Caveats
Local LLMs, particularly computationally heavy backends like Exllama, require substantial GPU resources. The README notes that local models cannot serve multiple requests concurrently because inference is expensive, so a semaphore limits in-flight requests to one.
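The gating pattern the README describes can be pictured as a one-permit asyncio semaphore wrapped around local inference; this is a sketch of the idea with illustrative names, not the project's own code.

```python
# Sketch of single-request gating: a one-permit asyncio.Semaphore
# serializes local inference so concurrent chats queue rather than
# contend for the GPU. Names here are illustrative, not the project's.
import asyncio

llm_gate = asyncio.Semaphore(1)


async def run_local_llm(prompt: str) -> str:
    async with llm_gate:  # only one inference in flight at a time
        # The expensive llama.cpp / Exllama call would run here,
        # typically offloaded to a worker thread or subprocess.
        await asyncio.sleep(0.1)  # stand-in for generation time
        return f"echo: {prompt}"


async def main():
    # Two "concurrent" requests: the second waits for the first.
    print(await asyncio.gather(run_local_llm("a"), run_local_llm("b")))


asyncio.run(main())
```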