Local voice agents for macOS
This repository provides a framework for building real-time voice AI applications on macOS, and demonstrates a voice agent that runs entirely locally using the Pipecat framework. It is aimed at developers and researchers interested in low-latency, on-device voice AI, with voice-to-voice latency that can fall under 800 ms on M-series Macs with strong models.
How It Works
The system utilizes a pipeline of local models for various voice processing tasks, including Silero VAD for voice activity detection, MLX Whisper for speech-to-text, Gemma3n 4B for language understanding, and Kokoro TTS for text-to-speech. Communication between the agent and client is handled via a low-latency, serverless WebRTC connection, optimized for real-time audio. The architecture is modular, allowing for easy swapping of models and customization of the pipeline, including tool calling and parallel processing.
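The modular design described above can be illustrated with a minimal sketch. It is not Pipecat's API: the `Pipeline` class, its `swap` method, and the `fake_*` stages are hypothetical stand-ins for the VAD, speech-to-text, language model, and text-to-speech steps, and it passes whole buffers synchronously, whereas a real-time agent would stream audio.

```python
from dataclasses import dataclass
from typing import Callable, List

# A stage is any callable that maps one payload to the next.
# All names below are hypothetical stand-ins, not Pipecat's API.
Stage = Callable[[object], object]


@dataclass
class Pipeline:
    stages: List[Stage]

    def run(self, payload):
        """Pass the payload through each stage in order."""
        for stage in self.stages:
            payload = stage(payload)
            if payload is None:  # e.g. the VAD stage found no speech
                return None
        return payload

    def swap(self, index: int, stage: Stage) -> "Pipeline":
        """Return a new pipeline with the stage at `index` replaced."""
        stages = list(self.stages)
        stages[index] = stage
        return Pipeline(stages)


def fake_vad(audio: bytes):
    # Stand-in for voice activity detection: all-zero audio is silence.
    return audio if any(audio) else None


def fake_stt(audio: bytes) -> str:
    # Stand-in for speech-to-text: pretend the bytes are the transcript.
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    # Stand-in for the language model.
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    # Stand-in for text-to-speech: pretend the text is the audio.
    return text.encode("utf-8")


pipeline = Pipeline([fake_vad, fake_stt, fake_llm, fake_tts])
```

Calling `pipeline.run(b"hello")` walks the four stages in order, and `pipeline.swap(2, other_llm)` yields a pipeline with a different language-model stage while leaving the other stages untouched, which is the kind of model swapping the architecture is meant to allow.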
Quick Start & Requirements
In server/, install dependencies and start the bot with uv run bot.py, or with python3.12 -m venv venv && source venv/bin/activate && pip install -r requirements.txt && python bot.py.
In client/, run npm i, and then npm run dev. Access the client via the URL provided in the terminal.
Highlighted Details
Maintenance & Community
The project is maintained by kwindla. Further community and contribution details are not specified in the README.
Licensing & Compatibility
The repository's licensing is not explicitly stated in the provided README content. Compatibility for commercial use or closed-source linking is therefore undetermined.
Limitations & Caveats
The initial startup time for model loading can exceed 30 seconds. The README suggests setting HF_HUB_OFFLINE=1 for faster subsequent startups. The core dependencies and the licensing for commercial use require further clarification.
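One way to apply the HF_HUB_OFFLINE=1 setting is to launch the bot from a small wrapper, sketched below. The helper names `offline_env` and `launch_bot` are hypothetical, and the sketch assumes the models were already downloaded to the local Hugging Face cache by an earlier run, since offline mode only reads cached files.

```python
import os
import subprocess
import sys


def offline_env(base=None):
    """Copy an environment mapping and switch the Hugging Face Hub to offline mode."""
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"  # skip network checks, use cached files only
    return env


def launch_bot(script="bot.py"):
    """Run the bot script (assumed to be in the current directory) with offline mode set."""
    return subprocess.run([sys.executable, script], env=offline_env(), check=False)
```

Calling `launch_bot()` from the server/ directory would start bot.py with the variable set for that process only, leaving the rest of the shell session unchanged.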