quillman by modal-labs

Real-time voice chat app with speech-to-speech LLM

Created 2 years ago
1,179 stars

Top 32.8% on SourcePulse

Project Summary

QuiLLMan is a voice chat application that demonstrates speech-to-speech language model integration, aimed at developers building conversational AI applications. It delivers near-instantaneous, human-like conversational responses by streaming audio in both directions, and serves as a foundation for experimentation and custom language model-based apps.

How It Works

The system uses Kyutai Lab's Moshi model, which listens, plans, and responds continuously. Moshi relies on the Mimi streaming encoder/decoder for unbroken audio input and output, and on a speech-text foundation model to decide when to respond. Bidirectional websocket streaming combined with the Opus audio codec keeps latency low, so that on a stable internet connection response times approach the cadence of human conversation.
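To make the bidirectional streaming idea concrete, here is a minimal sketch of the full-duplex pattern. It is not QuiLLMan's code: in-memory asyncio queues stand in for the websocket, byte strings stand in for Opus-encoded audio frames, and `fake_server`, `send_audio`, `receive_audio`, and `run_session` are hypothetical names chosen for illustration. The point is only that sending and receiving run as independent loops, so replies can arrive while input is still being sent.

```python
import asyncio


async def send_audio(uplink: asyncio.Queue, frames: list[bytes]) -> None:
    """Push input frames upstream one at a time, as a microphone loop would."""
    for frame in frames:
        await uplink.put(frame)
        await asyncio.sleep(0)  # yield so the other loops can run between frames
    await uplink.put(None)  # sentinel: end of the input stream


async def fake_server(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Hypothetical stand-in for the model: answers each frame as it arrives."""
    while (frame := await uplink.get()) is not None:
        await downlink.put(b"reply:" + frame)
    await downlink.put(None)  # sentinel: end of the output stream


async def receive_audio(downlink: asyncio.Queue) -> list[bytes]:
    """Collect reply frames, independently of the sending loop."""
    received = []
    while (frame := await downlink.get()) is not None:
        received.append(frame)
    return received


async def run_session(frames: list[bytes]) -> list[bytes]:
    # Assumption: the two queues model the two directions of one websocket.
    uplink: asyncio.Queue = asyncio.Queue()
    downlink: asyncio.Queue = asyncio.Queue()
    # The three loops run concurrently, so the first reply can be received
    # before the last input frame has been sent.
    _, _, received = await asyncio.gather(
        send_audio(uplink, frames),
        fake_server(uplink, downlink),
        receive_audio(downlink),
    )
    return received


if __name__ == "__main__":
    print(asyncio.run(run_session([b"f0", b"f1", b"f2"])))
```

In a real client the queues would be replaced by a websocket connection and the byte strings by encoded audio, but the structure of separate, concurrently running send and receive loops would stay the same.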

Quick Start & Requirements

Development requires:

  • The modal Python package (pip install modal).
  • A Modal account (modal setup).
  • A Modal token set up in your environment (modal token new).

Start the Moshi websocket server with modal serve -m src.moshi. To test the websocket connection, install the development dependencies (pip install -r requirements/requirements-dev.txt) and run python tests/moshi_client.py. Serve the frontend and HTTP server with modal serve src.app, and deploy with modal deploy src.app. Changes are reloaded automatically, though frontend updates may require clearing the browser cache.

Highlighted Details

  • Powered by Kyutai Lab's Moshi speech-to-speech model.
  • Features Mimi streaming encoder/decoder for continuous audio.
  • Leverages bidirectional websockets and Opus codec for low-latency audio.
  • Intended as a starting point for language model-based applications.

Maintenance & Community

Contributions are explicitly welcomed. No specific community channels, maintainer information, or roadmap details are provided in the README.

Licensing & Compatibility

The README strongly advises users to check the specific license before any commercial use, indicating potential restrictions. No license type (e.g., MIT, Apache) is explicitly stated.

Limitations & Caveats

The code is provided primarily for illustration and experimentation. Users must independently verify licensing terms for commercial applications due to the lack of explicit licensing information.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
