Verbi by PromtEngineer

Voice assistant for experimenting with SOTA voice models

created 1 year ago
1,042 stars

Top 36.7% on sourcepulse

Project Summary

Verbi is a modular voice assistant framework designed for researchers and developers to experiment with and compare state-of-the-art (SOTA) speech and language models. It offers flexibility in swapping components for transcription, response generation, and text-to-speech (TTS), enabling easy evaluation of different AI services and local models.

How It Works

Verbi employs a modular architecture, allowing users to configure different SOTA models via a central config.py file. It supports integrations with cloud APIs like OpenAI, Groq, and Deepgram, as well as local models through Ollama and dedicated local TTS servers (MeloTTS, Piper). This design facilitates rapid prototyping and comparative analysis of voice assistant technologies.
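The config-driven swapping described above can be sketched as a small registry keyed by the names set in a config object. This is an illustration only, not Verbi's actual code: `Config`, the three registries, `build_pipeline`, and the stub backends are hypothetical names, and the real config.py may expose different settings.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Stage = Callable[[str], str]


# Hypothetical stand-in for the settings a central config.py might hold.
@dataclass(frozen=True)
class Config:
    transcription_model: str = "groq"
    response_model: str = "ollama"
    tts_model: str = "piper"


def _stub(stage: str, provider: str) -> Stage:
    """Build a placeholder backend; a real one would call an API or local server."""
    def run(payload: str) -> str:
        return f"{stage}[{provider}]({payload})"
    return run


# One registry per pipeline stage, keyed by provider name (names are assumptions).
TRANSCRIBERS: Dict[str, Stage] = {
    name: _stub("transcribe", name)
    for name in ("openai", "groq", "deepgram", "fastwhisperapi")
}
RESPONDERS: Dict[str, Stage] = {
    name: _stub("respond", name) for name in ("openai", "groq", "ollama")
}
SYNTHESIZERS: Dict[str, Stage] = {
    name: _stub("speak", name)
    for name in ("openai", "deepgram", "elevenlabs", "melotts", "piper")
}


def build_pipeline(cfg: Config) -> Stage:
    """Look up each stage by its configured name and chain the three stages."""
    try:
        transcribe = TRANSCRIBERS[cfg.transcription_model]
        respond = RESPONDERS[cfg.response_model]
        speak = SYNTHESIZERS[cfg.tts_model]
    except KeyError as exc:
        raise ValueError(f"unknown provider in config: {exc}") from exc

    def pipeline(audio: str) -> str:
        return speak(respond(transcribe(audio)))

    return pipeline
```

With this layout, comparing two services means changing one field, for example `Config(tts_model="melotts")`, while the rest of the pipeline stays untouched.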

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.10+; API keys for whichever cloud services you enable (OpenAI, Groq, Deepgram, ElevenLabs); optional local model setup (Ollama, FastWhisperAPI, MeloTTS, Piper).
  • Setup: Clone the repo, create a virtual environment, install dependencies, set API keys in .env, and select models in config.py.
  • Run: python run_voice_assistant.py
  • Docs: See the FastWhisperAPI, MeloTTS, and Piper documentation for local server setup.
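Before running the assistant, it can help to confirm that the keys for the selected cloud providers are actually present. The sketch below parses simple KEY=VALUE lines such as those in a .env file and reports what is missing. The variable names in `PROVIDER_KEYS` are assumptions based on common conventions, not names confirmed by the project, and both helper functions are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping

# Assumed environment variable names; check the project's own .env
# template for the names it actually reads.
PROVIDER_KEYS: Dict[str, str] = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
}


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(providers: Iterable[str], env: Mapping[str, str]) -> List[str]:
    """Return variable names that are unset or empty for the chosen providers.

    Providers without an entry in PROVIDER_KEYS (for example a local
    model) are treated as needing no key.
    """
    return [
        PROVIDER_KEYS[p]
        for p in providers
        if p in PROVIDER_KEYS and not env.get(PROVIDER_KEYS[p])
    ]
```

A typical use would be to read the .env file with `parse_env_lines(open(".env"))` and pass the result, together with the providers chosen in config.py, to `missing_keys`.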

Highlighted Details

  • Supports multiple cloud APIs (OpenAI, Groq, Deepgram) for transcription and response generation.
  • Integrates ElevenLabs, Deepgram, OpenAI, MeloTTS, and Piper for TTS.
  • Allows local model integration via Ollama for response generation and FastWhisperAPI for transcription.
  • Centralized configuration and API key management.

Maintenance & Community

The project is actively maintained by PromtEngineer. Community contributions are welcomed via pull requests.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

Because the license is unspecified, commercial use cannot be assumed to be permitted. Some local TTS models (MeloTTS, Piper) also require separate installation and a running server before Verbi can use them.
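Since the local TTS options depend on a separately started server, a quick reachability check can avoid confusing failures at synthesis time. The following is a minimal sketch using only the standard library. `server_is_up` and `pick_tts` are hypothetical helpers, and the host and port are parameters because the actual MeloTTS or Piper server address depends on how each was set up.

```python
import socket


def server_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def pick_tts(preferred: str, fallback: str, host: str, port: int) -> str:
    """Choose the local TTS name if its server answers, else the fallback."""
    return preferred if server_is_up(host, port) else fallback
```

This only confirms that something is listening on the port, not that it is the expected TTS server, so it is a first-line check rather than a full health probe.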

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 77 stars in the last 90 days
