unmute by kyutai-labs

LLM voice and speech interface

Created 3 months ago
870 stars

Top 41.3% on SourcePulse

Project Summary

Unmute gives text-based Large Language Models (LLMs) a voice, enabling real-time spoken conversations. It targets users and developers who want to add speech capabilities to LLM applications and need a low-latency, flexible system.

How It Works

Unmute runs a three-stage pipeline: a Speech-to-Text (STT) model transcribes the user's speech, an LLM processes the resulting text, and a Text-to-Speech (TTS) model converts the LLM's text response to speech. The design keeps latency low by optimizing the STT and TTS components, and it integrates with various LLM backends, such as vLLM or external APIs.
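The three stages can be pictured with a minimal Python sketch. This is not Unmute's code: the class name, the stage signatures, and the stand-in functions are all hypothetical, and synthesizing each LLM chunk as it arrives is an assumption about one way a pipeline like this might keep latency down.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class VoicePipeline:
    """Hypothetical chain: speech -> transcript -> reply text -> speech."""
    stt: Callable[[bytes], str]            # audio in, transcript out
    llm: Callable[[str], Iterable[str]]    # transcript in, reply chunks out
    tts: Callable[[str], bytes]            # text chunk in, audio out

    def respond(self, user_audio: bytes) -> Iterator[bytes]:
        transcript = self.stt(user_audio)
        # Assumption: synthesize each chunk as soon as the LLM yields it,
        # instead of waiting for the complete reply.
        for chunk in self.llm(transcript):
            yield self.tts(chunk)


# Stand-in stages for illustration only; real stages would call models.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> Iterator[str]:
    for word in f"You said: {prompt}".split():
        yield word


def fake_tts(text: str) -> bytes:
    return text.upper().encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = list(pipeline.respond(b"hello there"))
```

Because each stage is just a callable, the LLM stage could be swapped for a different backend without touching the STT or TTS stages, which mirrors the flexibility described above.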

Quick Start & Requirements

  • Installation: Recommended via Docker Compose (docker compose up --build).
  • Hardware: GPU with CUDA support and at least 16 GB memory.
  • OS: Linux or Windows with WSL. macOS is not supported.
  • Dependencies: NVIDIA Container Toolkit for Docker. Hugging Face Hub token for LLM access.
  • Setup: Docker Compose setup is described as "Very easy."
  • Documentation: Unmute.sh
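
Before running docker compose up --build, it may help to check the listed requirements. The sketch below is a hypothetical helper, not part of Unmute. The environment variable name HUGGING_FACE_HUB_TOKEN is an assumption, and finding nvidia-smi on the PATH only suggests that an NVIDIA driver is installed; it does not confirm the NVIDIA Container Toolkit or the 16 GB memory requirement.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def missing_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
    token_var: str = "HUGGING_FACE_HUB_TOKEN",  # assumed variable name
) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if which("docker") is None:
        problems.append("docker not found on PATH")
    if which("nvidia-smi") is None:
        problems.append("nvidia-smi not found (is the NVIDIA driver installed?)")
    if not env.get(token_var):
        problems.append(f"{token_var} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

The env and which parameters exist only so the check can be exercised without touching the real system.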

Highlighted Details

  • Achieves ~450 ms TTS latency on a multi-GPU setup, down from ~750 ms on a single GPU.
  • Supports running STT, TTS, and LLM on separate GPUs for performance gains.
  • Frontend is a Next.js app; the backend communicates with it via a protocol based on the OpenAI Realtime API.
  • Includes a load testing client for measuring latency and throughput.
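
To illustrate what a protocol modeled on the OpenAI Realtime API might look like on the wire, here is a minimal Python sketch that builds and parses one JSON event. The event type input_audio_buffer.append and its base64 audio field come from the OpenAI Realtime API; whether Unmute accepts exactly this event, and which audio encoding it expects, is an assumption. The helper names are hypothetical.

```python
import base64
import json


def audio_append_event(audio_chunk: bytes) -> str:
    """Wrap raw audio bytes in a Realtime-style JSON event (assumed shape)."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(audio_chunk).decode("ascii"),
    })


def parse_event(message: str) -> tuple[str, dict]:
    """Decode a JSON event and return its type alongside the full payload."""
    event = json.loads(message)
    return event["type"], event


# Placeholder bytes; a real client would send encoded microphone audio.
msg = audio_append_event(b"\x00\x01\x02")
event_type, event = parse_event(msg)
```

In a real client these strings would travel over a WebSocket connection; the transport is left out here to keep the sketch self-contained.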

Maintenance & Community

  • Project actively encourages issue reporting for troubleshooting.
  • Development pointers are provided for modifying voices, prompts, and swapping frontends.
  • Contributions for features like tool calling are welcomed.

Licensing & Compatibility

  • No explicit license is mentioned in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Native macOS support is not provided.
  • HTTPS support is omitted from default Docker Compose and Dockerless setups.
  • Docker Swarm deployment is documented for internal use but not supported for debugging.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 5
  • Issues (30d): 10

Star History

  • 73 stars in the last 30 days
