Self-hosted voice chat with LLMs
Sage provides a self-hosted, offline voice chat experience with large language models, aimed at users who want privacy and control over their AI interactions. It runs with low latency on consumer hardware, chaining speech-to-text transcription and LLM responses into a single conversational pipeline.
How It Works
Sage uses state-of-the-art open-source speech processing models for transcription. For text generation, it supports self-hosted LLMs via Ollama or integrates with third-party providers such as Deepseek, OpenAI, Anthropic, and Together.ai. Configuration is managed through a .env file that specifies API keys and desired models, making it easy to switch between LLM backends.
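As a rough sketch, a .env along these lines would select a backend and supply credentials; the variable names below are illustrative assumptions, not keys taken from the project:

```
# Illustrative sketch only; Sage's actual variable names may differ.
LLM_PROVIDER=ollama      # one of: ollama, deepseek, openai, anthropic, together
LLM_MODEL=llama3         # model served by the chosen provider
OPENAI_API_KEY=...       # API keys only needed for third-party providers
ANTHROPIC_API_KEY=...
```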
Quick Start & Requirements
Run bun docker-build, then bun docker-run. The UI is served at http://localhost:3000. Requires Docker.
For a native build, run setup-unix.sh or setup-win.bat. The first run on macOS takes roughly 20 minutes while CoreML models compile.
The model files kokoro-v0_19.onnx, voices.json, and ggml-large-v3-turbo.bin must be downloaded for Docker; the native setup handles these downloads automatically.
Maintenance & Community
The project is actively developed by farshed. Further community engagement channels are not specified in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial or closed-source use is not specified.
Limitations & Caveats
Speech inference under Docker is roughly 4-5x slower than a native build. CUDA support is listed as future work, so GPU acceleration is not yet available. The native setup requires a substantial list of development tools.