S.A.T.U.R.D.A.Y by GRVYDEV

Vocal computing toolbox for building voice interfaces to LLMs

Created 2 years ago

700 stars

Top 48.9% on SourcePulse

1 Expert Loves This Project

Elvis Saravia

Founder of DAIR.AI

Project Summary

Project S.A.T.U.R.D.A.Y is a modular toolbox for vocal computing that lets users build self-hosted, AI-powered voice assistants akin to J.A.R.V.I.S. It targets developers and enthusiasts who want to build voice interfaces for LLMs, and it is designed so that different AI models can be integrated.

How It Works

The project employs a tool-based architecture in which each piece of functionality is wrapped in a "tool." Each tool comprises an "Engine" for domain-specific logic (e.g., voice activity detection) and a "Backend" that performs AI inference, so the underlying model can be swapped without touching the engine. The three core tools, Speech-to-Text (STT), Text-to-Text (TTT), and Text-to-Speech (TTS), are chained into a pipeline for vocal interaction.
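To make the Engine/Backend split concrete, here is a minimal Python sketch of the idea. It is not the project's actual code or API: the names Tool, Engine, Backend, run_pipeline, and all stand-in classes are hypothetical, and the "models" are trivial placeholders rather than whisper.cpp, an LLM, or Coqui TTS.

```python
from dataclasses import dataclass
from typing import Any, Protocol, Sequence


class Engine(Protocol):
    """Domain-specific logic for one tool (hypothetical interface)."""

    def prepare(self, payload: Any) -> Any: ...


class Backend(Protocol):
    """Model inference for one tool (hypothetical interface)."""

    def infer(self, payload: Any) -> Any: ...


@dataclass
class Tool:
    """A tool pairs an engine with a swappable inference backend."""

    name: str
    engine: Engine
    backend: Backend

    def run(self, payload: Any) -> Any:
        return self.backend.infer(self.engine.prepare(payload))


class TrimSilenceEngine:
    """Stand-in for voice activity detection: strips zero bytes at both ends."""

    def prepare(self, payload: bytes) -> bytes:
        return payload.strip(b"\x00")


class PassthroughEngine:
    """Engine with no domain logic; forwards the payload unchanged."""

    def prepare(self, payload: Any) -> Any:
        return payload


class FakeSTTBackend:
    """Placeholder speech-to-text: treats the audio bytes as UTF-8 text."""

    def infer(self, payload: bytes) -> str:
        return payload.decode("utf-8")


class EchoTTTBackend:
    """Placeholder text-to-text: echoes the prompt back."""

    def infer(self, payload: str) -> str:
        return f"You said: {payload}"


class ShoutTTTBackend:
    """Alternative text-to-text backend, used to show a swap."""

    def infer(self, payload: str) -> str:
        return payload.upper() + "!"


class FakeTTSBackend:
    """Placeholder text-to-speech: encodes the text as bytes."""

    def infer(self, payload: str) -> bytes:
        return payload.encode("utf-8")


def run_pipeline(tools: Sequence[Tool], payload: Any) -> Any:
    """Feed the output of each tool into the next one (STT -> TTT -> TTS)."""
    for tool in tools:
        payload = tool.run(payload)
    return payload


if __name__ == "__main__":
    stt = Tool("stt", TrimSilenceEngine(), FakeSTTBackend())
    ttt = Tool("ttt", PassthroughEngine(), EchoTTTBackend())
    tts = Tool("tts", PassthroughEngine(), FakeTTSBackend())
    print(run_pipeline([stt, ttt, tts], b"\x00\x00hello\x00"))

    # Swapping the model only means passing a different backend.
    ttt = Tool("ttt", PassthroughEngine(), ShoutTTTBackend())
    print(run_pipeline([stt, ttt, tts], b"\x00\x00hello\x00"))
```

In this sketch the engines never need to know which backend they are paired with, which is the property that lets one model be replaced by another without changing the rest of the pipeline.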

Quick Start & Requirements

Install/Run: Run make rtc, make tts, and make client from the project root.
Prerequisites: Golang, Python, Make, a C compiler, pkg-config, opus, mecab, espeak. Tested on M1 Pro/Max.
Setup: The demo requires three processes (RTC server, TTS server, client), started in a specific order. First-time TTS server setup involves running pip install -r requirements.txt.
Docs: Getting Started
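For readers who would rather not juggle three terminals, the launch sequence could be scripted roughly as follows. This is only a sketch under assumptions: the make targets come from the quick start, but the start order (rtc, then tts, then client) is inferred from the order in which the processes are listed, and the delay between launches is an arbitrary placeholder. The helper names build_commands and launch_all are hypothetical and not part of the project.

```python
import subprocess
import time
from typing import Any, Callable, List, Sequence

# Make targets named in the quick start. The order is an assumption based on
# how the three processes are listed, not a confirmed requirement.
TARGETS = ("rtc", "tts", "client")


def build_commands(targets: Sequence[str] = TARGETS) -> List[List[str]]:
    """Return one `make <target>` command per process, in launch order."""
    return [["make", target] for target in targets]


def launch_all(
    project_root: str,
    spawn: Callable[..., Any] = subprocess.Popen,
    delay_s: float = 5.0,  # placeholder wait so each server can come up
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Start each process from the project root, pausing between launches."""
    commands = build_commands()
    procs = []
    for index, cmd in enumerate(commands):
        procs.append(spawn(cmd, cwd=project_root))
        if index < len(commands) - 1:
            sleep(delay_s)
    return procs


if __name__ == "__main__":
    processes = launch_all(".")
    try:
        # Block until the client (the last process started) exits.
        processes[-1].wait()
    finally:
        for proc in processes:
            proc.terminate()
```

The spawn and sleep parameters exist only so the sequencing can be exercised without actually invoking make; a fixed delay is a crude readiness check, and a real script would more likely wait for each server to report that it is listening.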

Highlighted Details

Integrates Pion for WebRTC, whisper.cpp for STT, and Coqui TTS for TTS.
Modular design allows for decoupled AI model upgrades.
Roadmap includes local TTT inference (e.g., llama.cpp) and improved ease of use.
Demo provides a functional J.A.R.V.I.S.-like assistant.

Maintenance & Community

Community engagement takes place on Discord.
Open to contributions and feature requests.

Licensing & Compatibility

License: MIT
Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is primarily tested on M1 Pro/Max hardware and may require significant processing power. The README notes that users may encounter bugs and installation issues and encourages them to report any they find. The demo depends on the server processes being started in the correct order.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days