Real-time voice chat with AI using streaming audio
This project provides a real-time, voice-driven conversational AI experience, enabling users to speak naturally with an LLM and receive spoken responses. It's designed for users seeking a fluid, interactive AI companion, offering low-latency communication through a client-server architecture.
How It Works
The system captures voice via the browser, streams audio chunks to a Python backend using WebSockets, and transcribes speech to text using RealtimeSTT. The text is processed by an LLM (Ollama or OpenAI), and the AI's response is synthesized into speech by RealtimeTTS. Audio is streamed back to the browser for playback, with support for interruptions and dynamic silence detection for natural turn-taking.
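The dynamic silence detection mentioned above can be sketched as follows. This is an illustrative model of the idea, not the project's actual code: the class name, RMS threshold, and chunk-count values are assumptions.

```python
# Hypothetical sketch of energy-based turn-taking: a turn ends after a run of
# consecutive quiet audio chunks. Names and thresholds are illustrative only.
import math


class TurnDetector:
    """Flags end-of-turn after enough consecutive low-energy chunks."""

    def __init__(self, silence_threshold=0.02, silence_chunks_needed=15):
        self.silence_threshold = silence_threshold          # RMS level treated as silence
        self.silence_chunks_needed = silence_chunks_needed  # quiet chunks before turn ends
        self.quiet_run = 0

    @staticmethod
    def rms(samples):
        """Root-mean-square energy of a chunk of float samples in [-1, 1]."""
        if not samples:
            return 0.0
        return math.sqrt(sum(s * s for s in samples) / len(samples))

    def feed(self, samples):
        """Feed one chunk; return True when the speaker's turn appears over."""
        if self.rms(samples) < self.silence_threshold:
            self.quiet_run += 1
        else:
            self.quiet_run = 0  # any speech resets the silence counter
        return self.quiet_run >= self.silence_chunks_needed
```

A real implementation would adapt the threshold to ambient noise ("dynamic" detection) and let incoming speech interrupt TTS playback, but the core loop is this counter reset.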
Quick Start & Requirements
Build the images, start the services in the background, then pull the LLM model into the Ollama container:
docker compose build
docker compose up -d
docker compose exec ollama ollama pull hf.co/bartowski/huihui-ai_Mistral-Small-24B-Instruct-2501-abliterated-GGUF:Q4_K_M
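Once Ollama is serving the pulled model, the backend's "text processed by an LLM" step amounts to a streaming chat request against Ollama's HTTP API. A minimal sketch, assuming Ollama's default port and documented `/api/chat` route; the helper names are illustrative, not the project's code:

```python
# Sketch of streaming a chat completion from a local Ollama server.
# build_chat_request/stream_reply are hypothetical helper names.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default endpoint
MODEL = "hf.co/bartowski/huihui-ai_Mistral-Small-24B-Instruct-2501-abliterated-GGUF:Q4_K_M"


def build_chat_request(user_text, history=()):
    """Assemble the JSON body for a streaming chat completion."""
    messages = list(history) + [{"role": "user", "content": user_text}]
    return {"model": MODEL, "messages": messages, "stream": True}


def stream_reply(user_text):
    """Yield response text fragments as Ollama streams them (needs a running server)."""
    body = json.dumps(build_chat_request(user_text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # with stream=True, Ollama emits one JSON object per line
            chunk = json.loads(line)
            yield chunk.get("message", {}).get("content", "")
```

Streaming matters here: each text fragment can be handed to the TTS engine immediately, which is what keeps end-to-end latency low.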
Limitations & Caveats
Performance on CPU-only systems or weaker GPUs will be significantly slower. Manual (non-Docker) installation, especially on non-Windows systems or with different CUDA versions, may require troubleshooting.