matatonic: OpenAI API-compatible server for text-to-speech
Top 43.0% on SourcePulse
This project provides an OpenAI API-compatible text-to-speech server that lets users run their own private TTS service. It supports fast CPU-based generation via Piper TTS and high-quality synthesis with voice cloning via Coqui AI's XTTS v2. It targets developers and users who need a self-hosted, flexible TTS solution.
How It Works
The server exposes an endpoint that mirrors OpenAI's /v1/audio/speech API. It uses Piper TTS for fast speech synthesis on the CPU and allows custom voice mapping. For higher fidelity and voice cloning, it integrates Coqui XTTS v2, which requires a GPU with approximately 4 GB of VRAM. XTTS v2 offers multilingual support with automatic language detection and can use custom fine-tuned models.
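A request to the endpoint described above might look like the following minimal sketch. It uses only the Python standard library and follows the field names of OpenAI's speech API (model, input, voice, response_format, speed). The base URL http://localhost:8000 and the helper names build_speech_request and synthesize are assumptions made for illustration, not taken from the project; adjust them to match your deployment.

```python
import json
import urllib.request

# Assumption: the server is reachable at this address; change it as needed.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, model="tts-1", voice="alloy",
                         response_format="mp3", speed=1.0, base_url=BASE_URL):
    """Build a POST request for the OpenAI-style /v1/audio/speech endpoint.

    Hypothetical helper: field names follow OpenAI's speech API.
    """
    payload = {
        "model": model,                      # e.g. "tts-1" or "tts-1-hd"
        "input": text,                       # text to synthesize
        "voice": voice,                      # e.g. "alloy", "echo", "fable"
        "response_format": response_format,  # e.g. "mp3", "opus", "flac"
        "speed": speed,                      # playback speed multiplier
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **options)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

With a server running, a call such as synthesize("Hello from a self-hosted TTS server.", "hello.mp3", voice="echo") would write the returned audio to hello.mp3. Building the request is kept separate from sending it so the payload can be inspected without a live server.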
Quick Start & Requirements
docker compose up
docker compose -f docker-compose.rocm.yml up
docker compose -f docker-compose.min.yml up
Requires curl and ffmpeg. An Nvidia GPU with CUDA or an AMD GPU with ROCm is needed for the respective backends.
Highlighted Details
tts-1 and tts-1-hd models with configurable voices (alloy, echo, fable, etc.).
mp3, opus, aac, flac, and pcm formats with adjustable speech speed.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
9 months ago
Inactive