Verbi by PromtEngineer

Voice assistant for experimenting with SOTA voice models

Created 1 year ago

1,087 stars

Top 35.0% on SourcePulse

Project Summary

Verbi is a modular voice assistant framework designed for researchers and developers to experiment with and compare state-of-the-art (SOTA) speech and language models. It offers flexibility in swapping components for transcription, response generation, and text-to-speech (TTS), enabling easy evaluation of different AI services and local models.

How It Works

Verbi employs a modular architecture, allowing users to configure different SOTA models via a central config.py file. It supports integrations with cloud APIs like OpenAI, Groq, and Deepgram, as well as local models through Ollama and dedicated local TTS servers (MeloTTS, Piper). This design facilitates rapid prototyping and comparative analysis of voice assistant technologies.

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites: Python 3.10+, API keys for cloud services (OpenAI, Groq, Deepgram), optional local model setup (Ollama, FastWhisperAPI, MeloTTS, Piper).
Setup: Clone repo, create virtual environment, install dependencies, set API keys in .env, configure config.py.
Run: python run_voice_assistant.py
Docs: FastWhisperAPI, MeloTTS, Piper

Highlighted Details

Supports multiple cloud APIs: OpenAI, Groq, Deepgram for transcription and response generation.
Integrates ElevenLabs, Deepgram, OpenAI, MeloTTS, and Piper for TTS.
Allows local model integration via Ollama for response generation and FastWhisperAPI for transcription.
Centralized configuration and API key management.

Maintenance & Community

The project is actively maintained by PromtEngineer. Community contributions are welcomed via pull requests.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The README does not specify the project's license, which is crucial for determining commercial use and compatibility. Some local TTS models (MeloTTS, Piper) require separate installation and server setup.

Verbi by PromtEngineer

Explore Similar Projects

orate by haydenbleasel

FluidVoice by altic-dev

ai-devices by developersdigest

gpt-voice-conversation-chatbot by Adri6336

june by mezbaul-h

ollama-voice-mac by apeatling

swift by ai-ng

Scriberr by rishikanthc

local-talking-llm by vndee

AI-Waifu-Vtuber by ardha27

QuickAgent by gkamradt

PaddleSpeech by PaddlePaddle