Vocal computing toolbox for building voice interfaces to LLMs
Top 49.6% on sourcepulse
Project S.A.T.U.R.D.A.Y is a modular toolbox for vocal computing, enabling users to build self-hosted, AI-powered voice assistants akin to J.A.R.V.I.S. It targets developers and enthusiasts interested in creating sophisticated voice interfaces for LLMs, offering flexibility to integrate various AI models.
How It Works
The project is organized around "tools," each of which encapsulates one capability. A tool pairs an "Engine," which holds the domain-specific logic (e.g., voice activity detection), with a "Backend," which performs the AI inference, so the underlying model can be swapped without touching the engine. The three core tools are Speech-to-Text (STT), Text-to-Text (TTT), and Text-to-Speech (TTS), which together form the pipeline for vocal interaction.
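The Engine/Backend split and the STT, TTT, TTS pipeline can be illustrated with a minimal sketch. This is not the project's actual code or API: every class and function name below (STTEngine, Pipeline, fake_stt, and so on) is hypothetical, the backends are toy stand-ins for real models, and the silence check is only a placeholder for real voice activity detection.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical backend signatures (assumptions, not the project's API).
# A backend is just a callable that performs inference, so swapping
# models means passing a different callable to the same engine.
STTBackend = Callable[[bytes], str]
TTTBackend = Callable[[str], str]
TTSBackend = Callable[[str], bytes]


@dataclass
class STTEngine:
    backend: STTBackend

    def transcribe(self, audio: bytes) -> Optional[str]:
        # Placeholder for voice activity detection: treat empty or
        # all-zero audio as silence and skip inference entirely.
        if not any(audio):
            return None
        return self.backend(audio)


@dataclass
class TTTEngine:
    backend: TTTBackend

    def respond(self, prompt: str) -> str:
        # Domain logic lives in the engine; here it only trims whitespace.
        return self.backend(prompt.strip())


@dataclass
class TTSEngine:
    backend: TTSBackend

    def synthesize(self, text: str) -> bytes:
        return self.backend(text)


@dataclass
class Pipeline:
    stt: STTEngine
    ttt: TTTEngine
    tts: TTSEngine

    def handle(self, audio: bytes) -> Optional[bytes]:
        # Speech -> text -> reply text -> speech.
        text = self.stt.transcribe(audio)
        if text is None:
            return None
        return self.tts.synthesize(self.ttt.respond(text))


# Toy backends standing in for real models.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(STTEngine(fake_stt), TTTEngine(fake_llm), TTSEngine(fake_tts))
```

Replacing fake_llm with any other callable of the same shape changes the model without changing TTTEngine or Pipeline, which is the kind of swap the Engine/Backend separation is meant to allow.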
Quick Start & Requirements
Run make rtc, make tts, and make client from the project root.
Prerequisites: pkg-config, opus, mecab, and espeak. Tested on M1 Pro/Max.
Install the Python dependencies with pip install -r requirements.txt.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is primarily tested on M1 Pro/Max hardware and may require significant processing power. The README notes potential bugs and installation issues, encouraging users to report them. The order of starting server processes is critical for the demo.