Real-time voice chat app with speech-to-speech LLM
Top 33.2% on SourcePulse
QuiLLMan is a voice chat application that demonstrates speech-to-speech language model integration, aimed at developers building conversational AI applications. It streams audio in both directions to deliver near-instantaneous, human-like conversational responses, and serves as a foundation for experimentation and custom LM-based apps.
How It Works
The system uses Kyutai Labs' Moshi model, which listens, plans, and responds continuously. Moshi relies on the Mimi streaming encoder/decoder for uninterrupted audio input and output, and on a speech-text foundation model to decide when to respond. Bidirectional websocket streaming combined with the Opus audio codec keeps latency low, so that on a stable internet connection response times come close to the cadence of human speech.
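The bidirectional streaming pattern described above can be sketched with the standard library alone. This is a minimal illustration, not QuiLLMan's or Moshi's actual protocol: two in-memory asyncio queues stand in for the websocket, the byte strings stand in for Opus-encoded audio frames, and the names send_frames, receive_frames, fake_server, and run_session are hypothetical. The point it shows is that sending and receiving run concurrently, so a reply can arrive while later input is still being sent.

```python
import asyncio
from typing import List, Optional


async def send_frames(outbound: asyncio.Queue, frames: List[bytes]) -> None:
    """Push captured audio frames upstream as soon as they are available."""
    for frame in frames:
        await outbound.put(frame)
    await outbound.put(None)  # sentinel: the client has no more audio


async def receive_frames(inbound: asyncio.Queue, played: List[bytes]) -> None:
    """Collect response frames as they arrive (a real client would play them)."""
    while True:
        frame: Optional[bytes] = await inbound.get()
        if frame is None:
            break
        played.append(frame)


async def fake_server(outbound: asyncio.Queue, inbound: asyncio.Queue) -> None:
    """Hypothetical stand-in for the model: answers each frame immediately."""
    while True:
        frame: Optional[bytes] = await outbound.get()
        if frame is None:
            break
        await inbound.put(b"reply:" + frame)
    await inbound.put(None)  # sentinel: the server has finished responding


async def run_session(frames: List[bytes]) -> List[bytes]:
    """Run sender, receiver, and server concurrently over one duplex channel."""
    outbound: asyncio.Queue = asyncio.Queue()
    inbound: asyncio.Queue = asyncio.Queue()
    played: List[bytes] = []
    await asyncio.gather(
        send_frames(outbound, frames),
        receive_frames(inbound, played),
        fake_server(outbound, inbound),
    )
    return played


if __name__ == "__main__":
    print(asyncio.run(run_session([b"frame-0", b"frame-1"])))
```

In a real client the two queues would be replaced by a single websocket connection, with one task writing encoded frames and another reading them, which is what lets the model start speaking without waiting for the user's turn to end.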
Quick Start & Requirements
Development requires the modal Python package (pip install modal), a Modal account (modal setup), and a Modal token configured in your environment (modal token new). The Moshi websocket server can be started locally with modal serve -m src.moshi. To test the websocket connection, install the development dependencies (pip install -r requirements/requirements-dev.txt) and run python tests/moshi_client.py. The frontend and HTTP server are served with modal serve src.app, and deployment is handled by modal deploy src.app. While modal serve is running, code changes are reloaded automatically, though frontend updates may require clearing the browser cache.
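Before running the commands above, it can help to confirm that the prerequisites are in place. The following is a rough sketch under stated assumptions: check_prerequisites is a hypothetical helper, not part of QuiLLMan or Modal, and it assumes Modal reads credentials either from the MODAL_TOKEN_ID and MODAL_TOKEN_SECRET environment variables or from a ~/.modal.toml file written by modal token new. Verify those locations against the Modal documentation for your installed version.

```python
import importlib.util
import os
from pathlib import Path
from typing import Mapping, Optional, Union


def check_prerequisites(
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Union[str, Path]] = None,
) -> dict:
    """Report whether the modal package and a Modal token appear to be set up.

    Assumption: credentials live in MODAL_TOKEN_ID / MODAL_TOKEN_SECRET or in
    ~/.modal.toml. This only checks for presence, not validity.
    """
    env = os.environ if env is None else env
    path = Path.home() / ".modal.toml" if config_path is None else Path(config_path)

    # Both variables must be non-empty for the environment to count as configured.
    has_env_token = bool(env.get("MODAL_TOKEN_ID")) and bool(env.get("MODAL_TOKEN_SECRET"))

    return {
        # find_spec returns None when the package cannot be imported.
        "modal_installed": importlib.util.find_spec("modal") is not None,
        "token_configured": has_env_token or path.is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing entry points back to the corresponding setup step: pip install modal for the package, or modal token new for the token.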
Maintenance & Community
Contributions are explicitly welcomed. No specific community channels, maintainer information, or roadmap details are provided in the README.
Licensing & Compatibility
The README strongly advises users to check the specific license before any commercial use, indicating potential restrictions. No license type (e.g., MIT, Apache) is explicitly stated.
Limitations & Caveats
The code is provided primarily for illustration and experimentation. Users must independently verify licensing terms for commercial applications due to the lack of explicit licensing information.
4 months ago
Inactive