Web interface for chatting with Alpaca models
Serge provides a self-hosted web interface for interacting with large language models (LLMs) like LLaMA via the llama.cpp backend. It's designed for users who want to run LLMs locally without relying on external APIs or managing complex setups, offering a user-friendly chat experience and an API for programmatic access.
How It Works
Serge pairs a SvelteKit frontend with a FastAPI backend built on LangChain. The backend drives llama.cpp through its Python bindings to run models locally, while Redis stores chat history and user parameters so sessions persist across restarts. The whole stack is dockerized and self-contained, which keeps LLM deployment and interaction simple.
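To make the backend-to-binding relationship concrete, here is a minimal sketch of a FastAPI service calling llama.cpp through the llama-cpp-python bindings. This is not Serge's actual code; the route name, model path, and generation parameters are illustrative assumptions.

```python
from fastapi import FastAPI
from llama_cpp import Llama

app = FastAPI()

# Load a local quantized model once at startup (the path is a
# placeholder; Serge keeps weights on a mounted Docker volume).
llm = Llama(model_path="weights/ggml-alpaca-7b-q4.bin", n_ctx=512)

@app.post("/prompt")
def prompt(text: str) -> dict:
    # llama-cpp-python returns an OpenAI-style completion dict.
    result = llm(text, max_tokens=128, stop=["### Instruction:"])
    return {"answer": result["choices"][0]["text"]}
```

Loading the model once at module import mirrors the long-lived process model a container provides, rather than paying the load cost per request.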
Quick Start & Requirements
Serge runs as a single Docker container exposing port 8008 (e.g., `docker run -d -p 8008:8008 ghcr.io/serge-chat/serge:latest`; see the upstream README for the current image tag and the volume mounts for weights and the database). Once it is up, the chat UI is served at http://localhost:8008 and the interactive API documentation at http://localhost:8008/api/docs. The main hardware requirement is enough RAM for the model you choose.
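The same API that serves the UI can be scripted. The sketch below uses hypothetical routes and parameters for creating a chat and asking a question; the authoritative schema is the OpenAPI listing at /api/docs.

```python
import requests

BASE = "http://localhost:8008/api"

# Start a chat session, then ask it a question. Both routes and
# their parameters are guesses for illustration; check /api/docs
# for the real paths and payloads.
chat_id = requests.post(f"{BASE}/chat", params={"model": "7B"}).json()
reply = requests.post(
    f"{BASE}/chat/{chat_id}/question",
    params={"prompt": "What is llama.cpp?"},
)
print(reply.text)
```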
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Memory usage depends heavily on the model being loaded, and insufficient RAM will crash the container. While the core functionality is stable, compatibility and performance can vary from model to model.
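A rough pre-flight check can catch this before the process dies: llama.cpp generally needs at least the model file's size in free RAM, plus context overhead. A minimal sketch follows; the weights path and the 1.2x headroom factor are assumptions, not Serge behavior.

```python
import os
import psutil  # third-party memory probe; any equivalent works

MODEL = "weights/ggml-alpaca-7b-q4.bin"  # hypothetical weights path

needed = os.path.getsize(MODEL) * 1.2   # file size plus ~20% headroom
available = psutil.virtual_memory().available
if available < needed:
    raise SystemExit(
        f"Model needs ~{needed / 1e9:.1f} GB free RAM; "
        f"only {available / 1e9:.1f} GB available."
    )
```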