fast-voice-assistant  by dsa

AI voice assistant demo with <500ms response

created 11 months ago
411 stars

Top 72.2% on sourcepulse

Project Summary

This project is a demo AI voice assistant for users who need near real-time voice interaction. It combines specialized speech and language models with a low-latency transport layer to achieve response times under 500ms.

How It Works

The assistant is built from modular components: Deepgram for Speech-to-Text (STT), a Cerebras LLM for language understanding and response generation, and Cartesia for Text-to-Speech (TTS). LiveKit handles transport, streaming audio in real time with low latency. Each component is chosen to keep end-to-end response time low.
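The STT, LLM, and TTS flow described here can be pictured as three chained streaming stages. The following is a minimal sketch using only the Python standard library. It is not the project's code, and every name in it (`fake_stt`, `fake_llm`, `fake_tts`, `microphone`, `run_pipeline`) is hypothetical. The stubs stand in for the Deepgram, Cerebras, and Cartesia clients, and the LiveKit transport is replaced by a simple in-memory generator.

```python
import asyncio
from typing import AsyncIterator


async def microphone(utterances: list[str]) -> AsyncIterator[bytes]:
    """Stand-in for the transport layer: yields one 'audio' chunk per utterance."""
    for utterance in utterances:
        yield utterance.encode("utf-8")


async def fake_stt(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stub STT stage (assumption): turns each audio chunk into a transcript."""
    async for chunk in audio:
        yield chunk.decode("utf-8")


async def fake_llm(transcripts: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stub LLM stage (assumption): streams a reply one token at a time."""
    async for text in transcripts:
        for token in f"you said {text}".split():
            yield token


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stub TTS stage (assumption): converts each token into an 'audio' frame."""
    async for token in tokens:
        yield token.encode("utf-8")


async def run_pipeline(utterances: list[str]) -> list[bytes]:
    """Chain the stages so each one consumes the previous stage's stream."""
    frames = []
    async for frame in fake_tts(fake_llm(fake_stt(microphone(utterances)))):
        frames.append(frame)
    return frames


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["hello"])))
```

Because each stage is an async generator, downstream stages can begin work as soon as the first item arrives instead of waiting for the whole upstream result, which is the general idea behind keeping a voice pipeline responsive.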

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.x, Deepgram API key, LiveKit Cloud project credentials.
  • Setup: Copy .env.example to .env and fill in the API keys.
  • Demo: https://cerebras.vercel.app
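Before starting the assistant, it can help to confirm that the .env file is fully populated. The sketch below parses .env-style text and reports which required keys are missing or empty. The key names in `REQUIRED_KEYS` are assumptions based on common naming for these services; the project's .env.example is the authoritative list. The helper names `parse_env` and `missing_keys` are hypothetical.

```python
# Assumed key names; check the project's .env.example for the real ones.
REQUIRED_KEYS = [
    "DEEPGRAM_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return required keys that are absent or set to an empty string."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        missing = missing_keys(parse_env(handle.read()), REQUIRED_KEYS)
    print("Missing keys:", missing if missing else "none")
```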

Highlighted Details

  • Achieves response times under 500ms.
  • Built with approximately 50 lines of code for the core logic.
  • Integrates Deepgram STT, Cerebras LLM, and Cartesia TTS.
  • Uses LiveKit for real-time transport.
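One way to check a sub-500ms claim like the one above is to time how long a response stream takes to produce its first item. The sketch below does that with the standard library only. `stub_reply` is a hypothetical stand-in for the assistant's audio output, and the delay it simulates is arbitrary, not a measurement of this project.

```python
import asyncio
import time
from typing import AsyncIterator

RESPONSE_BUDGET_MS = 500.0  # target stated in the project summary


async def stub_reply(delay_s: float) -> AsyncIterator[bytes]:
    """Hypothetical response stream: waits, then yields audio frames."""
    await asyncio.sleep(delay_s)
    yield b"first-audio-frame"
    yield b"second-audio-frame"


async def time_to_first_item(stream: AsyncIterator[bytes]) -> tuple[bytes, float]:
    """Return the first item and the milliseconds elapsed before it arrived."""
    start = time.perf_counter()
    async for item in stream:
        return item, (time.perf_counter() - start) * 1000.0
    raise ValueError("stream produced no items")


def within_budget(elapsed_ms: float, budget_ms: float = RESPONSE_BUDGET_MS) -> bool:
    """True when the measured latency is under the budget."""
    return elapsed_ms < budget_ms


if __name__ == "__main__":
    frame, elapsed = asyncio.run(time_to_first_item(stub_reply(0.05)))
    print(f"first frame after {elapsed:.1f} ms, within budget: {within_budget(elapsed)}")
```

Measuring time to first audio frame, rather than time to the complete reply, matches how a listener perceives responsiveness in a streaming assistant.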

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The project depends on third-party services accessed through API keys (Deepgram, LiveKit), and their usage may incur costs. Because no license is specified, it is unclear whether commercial use or redistribution is permitted.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 14 stars in the last 90 days

Explore Similar Projects

Starred by Thomas Wolf (Cofounder of Hugging Face), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

ultravox by fixie-ai

0.4%
4k
Multimodal LLM for real-time voice interactions
created 1 year ago
updated 5 days ago