Generative speech model for daily dialogue
ChatTTS is a generative speech model optimized for dialogue scenarios, targeting LLM assistants and conversational AI applications. It provides natural, expressive speech synthesis with fine-grained control over prosody, including laughter and pauses, aiming to surpass existing open-source TTS models in conversational quality.
How It Works
ChatTTS is built specifically for dialogue-style speech synthesis. It supports multiple speakers and exposes fine-grained control over prosodic features such as laughter, pauses, and interjections. The model is trained on a large corpus of Chinese and English audio, and the authors claim better prosody than other open-source TTS models.
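The prosody control described above can be sketched as plain text annotation followed by inference. This is a minimal sketch, not the project's reference usage: the control tokens `[laugh]` and `[uv_break]` and the `Chat().load()` / `infer()` calls are assumptions based on the upstream README and may differ between releases, while `annotate`, `add_laugh`, and `synthesize` are hypothetical helper names introduced here for illustration.

```python
from typing import List

# Control tokens assumed from the upstream README; verify against your release.
LAUGH = "[laugh]"
SHORT_BREAK = "[uv_break]"


def annotate(segments: List[str], pause: str = SHORT_BREAK) -> str:
    """Join non-empty text segments, placing a pause token between them."""
    cleaned = [s.strip() for s in segments if s.strip()]
    return f" {pause} ".join(cleaned)


def add_laugh(text: str) -> str:
    """Append a laughter token to the end of an utterance."""
    return f"{text.rstrip()} {LAUGH}"


def synthesize(texts: List[str]):
    """Run inference on annotated texts.

    Requires the ChatTTS package and its model weights. The import is lazy
    so the annotation helpers above work without ChatTTS installed.
    """
    import ChatTTS

    chat = ChatTTS.Chat()
    chat.load(compile=False)  # assumed loader name; older releases may differ
    return chat.infer(texts)  # assumed to return one waveform per input text
```

A call such as `synthesize([add_laugh(annotate(["So", "what happened next?"]))])` would then pass the annotated string to the model. Keeping annotation separate from inference makes the text preparation easy to check without loading any weights.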
Quick Start & Requirements
Install with pip install ChatTTS, or run pip install -e . from a local checkout for development.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The released model is for academic and research purposes only and cannot be used commercially. The authors have intentionally added noise and compressed audio quality to deter malicious use. English synthesis is noted as experimental.