Voice chat app for interacting with AI characters using speech
This project provides a web UI and CLI for real-time voice chat with AI characters. Chat can run locally via Ollama or through OpenAI, Anthropic, or xAI, while speech is handled by XTTS, OpenAI, ElevenLabs, or Kokoro. It targets users seeking interactive AI conversations, role-playing, or AI-assisted games and stories, offering flexibility in model and voice provider choices.
How It Works
The application leverages a modular architecture allowing users to mix and match LLM providers (OpenAI, Anthropic, xAI, Ollama) with various TTS and STT services. It supports OpenAI's WebRTC for real-time, interruptible conversations and OpenAI's enhanced TTS models for expressive speech. Local models like XTTS and Faster Whisper are also integrated, with options for GPU acceleration via CUDA. Sentiment analysis is used to adapt AI responses based on user mood.
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (or requirements_cpu.txt for CPU-only setups).
Highlighted Details
Supports OpenAI's gpt-4o-mini-tts model for expressive speech output.
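A minimal sketch of how a character reply might be rendered with gpt-4o-mini-tts, assuming the official `openai` Python SDK (1.x); the function name, default voice, and return handling are assumptions, not taken from the project.

```python
def synthesize_reply(client, text: str, voice: str = "alloy") -> bytes:
    """Render `text` to speech with OpenAI's gpt-4o-mini-tts model.

    `client` is an openai.OpenAI instance (or anything with the same
    `audio.speech.create` shape); returns raw audio bytes.
    """
    response = client.audio.speech.create(
        model="gpt-4o-mini-tts",
        voice=voice,   # assumed default voice name for illustration
        input=text,
    )
    return response.read()  # binary audio content from the SDK response
```

In the app itself this call would be one of several interchangeable TTS backends alongside XTTS, ElevenLabs, and Kokoro.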
Maintenance & Community
The project is actively maintained by bigsk1. Community support channels are not explicitly mentioned in the README.
Licensing & Compatibility
Licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Local XTTS and Faster Whisper performance is significantly slower on CPU. CUDA setup for Docker requires specific NVIDIA toolkit and cuDNN installations. The README notes that the sample character .wav files are of lower quality and can be replaced.