TTS models for Indic languages
This project provides state-of-the-art Text-to-Speech (TTS) models for 13 Indian languages, addressing the under-representation of these languages in speech synthesis research. It targets developers and researchers working with Indian languages, offering improved speech quality over existing models.
How It Works
The system utilizes a unified architecture based on FastPitch for acoustic modeling and HiFi-GAN V1 for vocoding. Models are trained jointly on male and female speakers, a configuration identified through extensive evaluation of acoustic models, vocoders, loss functions, and training schedules. This approach yields superior performance across Dravidian and Indo-Aryan languages.
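The two-stage design described above (an acoustic model producing mel-spectrogram frames, followed by a vocoder producing audio samples) can be sketched as a data-flow skeleton. This is a minimal illustration only, not the project's code: the functions are stand-ins that return placeholder values, and `N_MELS`, `HOP_LENGTH`, `FRAMES_PER_CHAR`, the speaker labels, and all function names are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List

# All constants below are illustrative assumptions, not values from the project.
N_MELS = 80            # mel bins per frame
HOP_LENGTH = 256       # audio samples generated per mel frame
FRAMES_PER_CHAR = 4    # fixed stand-in for predicted per-symbol durations

# One jointly trained model serves both speakers, selected by an ID.
SPEAKERS = {"female": 0, "male": 1}


@dataclass
class MelSpectrogram:
    frames: List[List[float]]  # shape: [n_frames][N_MELS]


def acoustic_model(text: str, speaker: str) -> MelSpectrogram:
    """Stand-in for the FastPitch stage: text plus speaker ID -> mel frames."""
    speaker_id = SPEAKERS[speaker]  # raises KeyError for an unknown speaker
    n_frames = FRAMES_PER_CHAR * len(text)
    # Placeholder values; a real model would predict these frames.
    return MelSpectrogram([[float(speaker_id)] * N_MELS for _ in range(n_frames)])


def vocoder(mel: MelSpectrogram) -> List[float]:
    """Stand-in for the HiFi-GAN stage: mel frames -> waveform samples."""
    # Placeholder silence; a real vocoder would upsample each frame to audio.
    return [0.0] * (len(mel.frames) * HOP_LENGTH)


def synthesize(text: str, speaker: str) -> List[float]:
    """Chain the two stages: acoustic model first, then vocoder."""
    return vocoder(acoustic_model(text, speaker))


audio = synthesize("namaste", "female")
print(f"{len(audio)} samples generated")
```

The point of the sketch is the interface between the stages: the vocoder only ever sees mel frames, so either stage can be swapped or retrained without changing the other.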
Quick Start & Requirements
Trainer and TTS: install dependencies using pip3 install -e .[all] for both, and then pip3 install -r requirements.txt.
libsndfile1-dev, ffmpeg, enchant
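As a rough pre-flight aid, the install steps and packages listed above could be checked from Python before running anything. This is a sketch under stated assumptions: the directory names `Trainer` and `TTS`, the choice to run `requirements.txt` from the project root, and the helper names `missing_prerequisites` and `install_plan` are all hypothetical, and the library lookup only detects runtime libraries, which is a proxy for the `-dev` packages rather than a guarantee.

```python
import shutil
from ctypes.util import find_library
from pathlib import Path
from typing import List, Tuple


def missing_prerequisites() -> List[str]:
    """Return the listed system prerequisites that cannot be located."""
    missing = []
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    # Finds the runtime library only; headers from the -dev package are not checked.
    if find_library("sndfile") is None:
        missing.append("libsndfile1-dev")
    # The enchant shared-library name varies by distribution, so try both.
    if find_library("enchant-2") is None and find_library("enchant") is None:
        missing.append("enchant")
    return missing


def install_plan(root: Path) -> List[Tuple[Path, List[str]]]:
    """Build (working directory, command) pairs without executing them."""
    # Assumed layout: Trainer/ and TTS/ checked out directly under root.
    plan = [(root / name, ["pip3", "install", "-e", ".[all]"])
            for name in ("Trainer", "TTS")]
    # Assumption: requirements.txt lives in the project root.
    plan.append((root, ["pip3", "install", "-r", "requirements.txt"]))
    return plan


print("Missing prerequisites:", missing_prerequisites() or "none")
for cwd, cmd in install_plan(Path(".")):
    print(f"(in {cwd}) {' '.join(cmd)}")
```

The plan is only printed, not run; each pair could be passed to `subprocess.run(cmd, cwd=cwd, check=True)` once the paths have been confirmed against the actual repository layout.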
Highlighted Details
coqui-ai/TTS library.
Maintenance & Community
Licensing & Compatibility
coqui-ai/TTS library is distributed under MPL 2.0, but this specific fork's license requires verification.
Limitations & Caveats
Trainer library, indicating potential instability or ongoing development.
8 months ago
Inactive