swift by ai-ng

Voice assistant demo powered by Groq, Cartesia, and Vercel

created 1 year ago
570 stars

Top 57.4% on sourcepulse

1 Expert Loves This Project
Project Summary

Swift is a fast AI voice assistant demo that chains hosted AI models for transcription, response generation, and speech synthesis. It targets developers and power users who want a responsive, voice-first application.

How It Works

Swift uses Groq for fast inference on two models: OpenAI's Whisper for speech-to-text and Meta's Llama 3 for text generation. Cartesia's Sonic model handles speech synthesis, and the resulting audio is streamed directly to the user. Voice Activity Detection (VAD) monitors the audio input and triggers callbacks when it detects speech segments. The application is built with Next.js and TypeScript and deployed on Vercel.
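The flow described above (speech segment, then transcription, then text generation, then streamed synthesis) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stage types, the Stages interface, and respondToSpeech are hypothetical names, and the three stages are passed in as plain functions so that no Groq or Cartesia SDK call is assumed.

```typescript
// Hypothetical stage signatures. In the real project these would wrap
// Groq (Whisper, Llama 3) and Cartesia (Sonic); no vendor SDK is called here.
type Transcribe = (audio: Uint8Array) => Promise<string>;
type Generate = (transcript: string) => Promise<string>;
type Synthesize = (text: string) => AsyncIterable<Uint8Array>;

interface Stages {
  transcribe: Transcribe;
  generate: Generate;
  synthesize: Synthesize;
}

// Runs one detected speech segment through the three stages and yields
// audio chunks as soon as the synthesizer produces them.
async function* respondToSpeech(
  segment: Uint8Array,
  stages: Stages,
): AsyncGenerator<Uint8Array> {
  const transcript = (await stages.transcribe(segment)).trim();
  if (transcript.length === 0) return; // nothing intelligible was said

  const reply = await stages.generate(transcript);
  for await (const chunk of stages.synthesize(reply)) {
    yield chunk; // forward each chunk without buffering the whole reply
  }
}
```

In this sketch a VAD callback would call respondToSpeech once per detected segment and forward each yielded chunk to the audio output. Keeping the stages injectable also lets the flow be exercised with stubs instead of live API calls.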

Quick Start & Requirements

  • Install dependencies: pnpm install
  • Start development server: pnpm dev
  • Requires API keys for Groq and Cartesia.
  • Project is a Next.js application.
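Because the app needs both API keys before it can do anything useful, a small startup check can help catch a missing key early. The sketch below is an illustration under assumptions: the variable names GROQ_API_KEY and CARTESIA_API_KEY and the helper missingApiKeys are not confirmed by this summary, so check the project's README or example environment file for the actual names.

```typescript
// Assumed variable names; verify them against the project's own env file.
const REQUIRED_KEYS = ["GROQ_API_KEY", "CARTESIA_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required keys that are unset or blank.
function missingApiKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Next.js project this could be called with process.env during server startup, failing with a clear error if the returned list is not empty.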

Highlighted Details

  • Utilizes Groq for low-latency LLM inference.
  • Employs Cartesia Sonic for fast, streamed speech synthesis.
  • Integrates VAD for efficient speech segment detection.
  • Built with Next.js and TypeScript, deployed on Vercel.

Maintenance & Community

The project acknowledges contributions from Groq and Cartesia for API access. Further community or maintenance details are not specified in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial or closed-source use is undetermined.

Limitations & Caveats

The project is presented as a demo; the README does not detail its production-readiness, scalability, or long-term maintenance. Using the required Groq and Cartesia APIs may incur costs.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

23 stars in the last 90 days

Explore Similar Projects

Starred by Thomas Wolf (Cofounder of Hugging Face), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

ultravox by fixie-ai

0.4%
4k
Multimodal LLM for real-time voice interactions
created 1 year ago
updated 4 days ago