index-tts
Zero-shot TTS system for industrial use
Top 3.4% on SourcePulse
IndexTTS is an industrial-level zero-shot text-to-speech system designed for high-quality, controllable voice synthesis, particularly excelling in Chinese language scenarios. It targets researchers and developers seeking advanced TTS capabilities, offering state-of-the-art performance and features like pronunciation correction and precise pause control.
How It Works
IndexTTS builds upon XTTS and Tortoise, integrating a conformer conditioning encoder and a BigVGAN2-based speechcode decoder. This architecture enhances training stability, speaker similarity, and audio quality. A key innovation is its character-pinyin hybrid modeling for accurate Chinese pronunciation, alongside punctuation-based pause control.
Quick Start & Requirements
Install: pip install -r requirements.txt and pip install -e . for CLI usage.
Prerequisites: ffmpeg, PyTorch. Model weights must be downloaded to a checkpoints directory.
Web demo: pip install -e ".[webui]" && python webui.py
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
pynini, requiring a conda installation.