unmute by kyutai-labs

LLM voice and speech interface

created 1 month ago
724 stars

Top 48.6% on sourcepulse

Project Summary

Unmute enables text-based Large Language Models (LLMs) to interact audibly, facilitating real-time voice conversations. It's designed for users and developers seeking to integrate speech capabilities into LLM applications, offering a low-latency, flexible system.

How It Works

Unmute uses a three-stage pipeline: a Speech-to-Text (STT) model transcribes the user's speech, an LLM processes the resulting text, and a Text-to-Speech (TTS) model converts the LLM's text response to speech. The design targets low latency through optimized STT and TTS components, and the LLM stage can be served by different backends, such as vLLM or an external API.
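
The three stages described above can be sketched as a chain of interchangeable callables. This is a minimal illustration, not Unmute's actual code: the `VoicePipeline` class and the `fake_*` stand-in stages are hypothetical names invented here, and the sketch runs each stage to completion, whereas a low-latency system would presumably stream partial results between stages.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical wrapper chaining speech -> text -> reply text -> speech."""

    stt: Callable[[bytes], str]   # audio in, transcript out
    llm: Callable[[str], str]     # transcript in, reply text out
    tts: Callable[[str], bytes]   # reply text in, audio out

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.stt(audio_in)   # stage 1: transcribe the user
        reply = self.llm(transcript)      # stage 2: generate a text reply
        return self.tts(reply)            # stage 3: synthesize the reply


# Stand-in stages so the sketch runs without any models (all hypothetical).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.respond(b"hello")
```

Because each stage is only a callable, swapping the LLM backend means replacing the `llm` argument and leaving the other two stages untouched.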

Quick Start & Requirements

  • Installation: Recommended via Docker Compose (docker compose up --build).
  • Hardware: GPU with CUDA support and at least 16 GB memory.
  • OS: Linux or Windows with WSL. macOS is not supported.
  • Dependencies: NVIDIA Container Toolkit for Docker. Hugging Face Hub token for LLM access.
  • Setup: Docker Compose setup is described as "Very easy."
  • Documentation: unmute.sh
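
The requirements above can be turned into a small preflight check to run before `docker compose up --build`. The sketch below is an assumption-laden illustration: `preflight_problems` is a hypothetical helper, not part of Unmute, and the caller is expected to supply the OS name (for example from `platform.system()`, which reports `Linux` inside WSL), the detected GPU memory, and the token value.

```python
from typing import List, Optional

MIN_GPU_MEMORY_GB = 16  # taken from the hardware requirement listed above


def preflight_problems(
    system: str,
    gpu_memory_gb: Optional[float],
    hf_token: Optional[str],
) -> List[str]:
    """Return a list of unmet requirements (empty means ready to start)."""
    problems: List[str] = []

    # OS: Linux, or Windows through WSL (which reports itself as Linux).
    if system == "Darwin":
        problems.append("macOS is not supported")
    elif system == "Windows":
        problems.append("run inside WSL rather than native Windows")
    elif system != "Linux":
        problems.append(f"unrecognized OS: {system}")

    # Hardware: a CUDA GPU with at least 16 GB of memory.
    if gpu_memory_gb is None:
        problems.append("no CUDA GPU detected")
    elif gpu_memory_gb < MIN_GPU_MEMORY_GB:
        problems.append(
            f"GPU has {gpu_memory_gb} GB, need at least {MIN_GPU_MEMORY_GB} GB"
        )

    # Dependencies: a Hugging Face Hub token for LLM access.
    if not hf_token:
        problems.append("missing Hugging Face Hub token")

    return problems
```

The check does not cover the NVIDIA Container Toolkit, which has to be verified separately on the Docker host.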

Highlighted Details

  • Achieves ~450 ms TTS latency on a multi-GPU setup, down from ~750 ms on a single GPU.
  • Supports running STT, TTS, and LLM on separate GPUs for performance gains.
  • Frontend is a Next.js app; the backend communicates with it via a protocol based on the OpenAI Realtime API.
  • Includes a load testing client for measuring latency and throughput.
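
To make the latency and throughput figures above concrete, here is a rough sketch of how a load test might collect and summarize them. It is not the project's load testing client: `measure_latencies` and `summarize` are hypothetical helpers, the requests run sequentially, and the 95th percentile uses the simple nearest-rank method.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(request: Callable[[], None], n: int) -> List[float]:
    """Call `request` n times in sequence and record each duration in ms."""
    latencies_ms: List[float] = []
    for _ in range(n):
        start = time.perf_counter()
        request()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float], wall_seconds: float) -> Dict[str, float]:
    """Report median latency, nearest-rank p95 latency, and requests/second."""
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "throughput_rps": len(ordered) / wall_seconds,
    }


# Example with made-up sample values, not measurements of Unmute.
report = summarize([450.0, 750.0, 600.0], wall_seconds=3.0)
```

In practice `request` would wrap one round trip to the service under test, and `wall_seconds` would be the total elapsed time of the run.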

Maintenance & Community

  • Project actively encourages issue reporting for troubleshooting.
  • Development pointers are provided for modifying voices, prompts, and swapping frontends.
  • Contributions for features like tool calling are welcomed.

Licensing & Compatibility

  • No explicit license is mentioned in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Native macOS support is not provided.
  • HTTPS support is omitted from default Docker Compose and Dockerless setups.
  • Docker Swarm deployment is documented, but the maintainers do not offer support for debugging it.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 16
  • Issues (30d): 15
  • Star history: 736 stars in the last 90 days
