fast-voice-assistant  by dsa

AI voice assistant demo with <500ms response

Created 1 year ago
581 stars

Top 55.8% on SourcePulse

Project Summary

This project is a demo AI voice assistant for users who need near real-time voice interaction. It combines hosted speech and language-model services with a real-time transport layer to achieve response times under 500ms.

How It Works

The assistant is built from three modular components: Deepgram for Speech-to-Text (STT), a Cerebras LLM for natural language understanding and response generation, and Cartesia for Text-to-Speech (TTS). Audio moves between the user and these components over LiveKit transport, which provides low-latency, real-time streaming. Each part of the stack is chosen to keep end-to-end response time low.
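The STT, LLM, and TTS chain described above can be pictured as three streaming stages wired back to back. The following is a minimal sketch using only the Python standard library. Every function in it (`fake_stt`, `fake_llm`, `fake_tts`, `microphone`, `run_pipeline`) is a hypothetical stand-in and not the Deepgram, Cerebras, Cartesia, or LiveKit API. It only illustrates how time-to-first-audio could be measured against a budget such as 500ms.

```python
import asyncio
import time
from typing import AsyncIterator, List, Optional, Tuple


async def microphone() -> AsyncIterator[bytes]:
    """Hypothetical audio source: yields two fake audio chunks."""
    for chunk in (b"hello", b"world"):
        yield chunk


async def fake_stt(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stand-in for STT: turns each audio chunk into one word."""
    async for chunk in audio:
        await asyncio.sleep(0.01)  # simulated recognition delay
        yield chunk.decode()


async def fake_llm(words: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stand-in for the LLM: waits for the full transcript, then streams tokens."""
    transcript = [word async for word in words]
    for token in ["You", "said:", *transcript]:
        await asyncio.sleep(0.01)  # simulated per-token generation delay
        yield token


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in for TTS: converts each token to an audio frame as it arrives."""
    async for token in tokens:
        await asyncio.sleep(0.01)  # simulated synthesis delay
        yield token.encode()


async def run_pipeline() -> Tuple[List[bytes], Optional[float]]:
    """Run the chain and record milliseconds until the first audio frame."""
    start = time.perf_counter()
    first_audio_ms: Optional[float] = None
    frames: List[bytes] = []
    async for frame in fake_tts(fake_llm(fake_stt(microphone()))):
        if first_audio_ms is None:
            first_audio_ms = (time.perf_counter() - start) * 1000
        frames.append(frame)
    return frames, first_audio_ms


if __name__ == "__main__":
    frames, first_audio_ms = asyncio.run(run_pipeline())
    print(frames)
    print(f"time to first audio: {first_audio_ms:.0f} ms")
```

Because the TTS stage consumes LLM tokens as they are produced, the first audio frame is available before the full reply has been generated. This is the general reason streaming pipelines can respond quickly, though the real project's wiring may differ.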

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.x, Deepgram API key, LiveKit Cloud project credentials.
  • Setup: Copy .env.example to .env and fill in the API keys.
  • Demo: https://cerebras.vercel.app
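Before starting the assistant, it can help to confirm that the .env file from the setup step actually contains the credentials listed under Prerequisites. The sketch below uses only the standard library. The variable names in `REQUIRED_KEYS` are assumptions based on common conventions, and the helpers `parse_env` and `missing_keys` are hypothetical; the project's own .env.example is the authority on which keys are needed.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Assumed variable names; check .env.example for the real ones.
REQUIRED_KEYS = (
    "DEEPGRAM_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str],
                 required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    if missing:
        print("Missing or empty in .env:", ", ".join(missing))
    else:
        print("All expected keys are set.")
```

Running the script from the project directory lists any expected key that is missing or left blank, which is cheaper to catch here than as an authentication error at runtime.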

Highlighted Details

  • Achieves response times under 500ms.
  • Core logic is approximately 50 lines of code.
  • Integrates Deepgram STT, Cerebras LLM, and Cartesia TTS.
  • Uses LiveKit for real-time transport.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The project depends on third-party services (Deepgram, LiveKit) that require API keys and may incur usage costs. Because no license is specified, the terms for commercial use and redistribution are unclear.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
