Zyphra
Open-weight text-to-speech model for expressive, high-quality speech generation
Top 7.2% on SourcePulse
Zonos-v0.1 is an open-weight text-to-speech model designed for highly natural and expressive speech generation, including zero-shot voice cloning. It targets researchers and developers seeking high-quality, controllable TTS capabilities, offering performance comparable to commercial providers.
How It Works
Zonos first normalizes the input text and converts it to phonemes with eSpeak, then predicts DAC tokens using either a transformer or a hybrid backbone. The model can be conditioned on speaker embeddings or audio prefixes, which gives fine-grained control over speech rate, pitch, audio quality, and emotions. It outputs audio natively at 44 kHz.
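Because the model predicts DAC tokens rather than raw samples, the length of a generated clip is set by how many token frames are produced. The sketch below works through that arithmetic. It assumes a 44.1 kHz sample rate and 512 audio samples per token frame, figures taken from the publicly described 44 kHz DAC codec and not stated in the text above; the function names are hypothetical helpers, not part of Zonos.

```python
import math

# Assumptions (not stated in the text above): the DAC variant runs at
# 44.1 kHz and compresses 512 audio samples into one token frame.
SAMPLE_RATE_HZ = 44_100
SAMPLES_PER_FRAME = 512


def frames_for_duration(seconds: float) -> int:
    """Smallest number of token frames that covers `seconds` of audio."""
    return math.ceil(seconds * SAMPLE_RATE_HZ / SAMPLES_PER_FRAME)


def duration_for_frames(frames: int) -> float:
    """Seconds of audio obtained by decoding `frames` token frames."""
    return frames * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ


if __name__ == "__main__":
    # Frame budget for a 30-second clip under the assumptions above.
    budget = frames_for_duration(30)
    print(budget, round(duration_for_frames(budget), 3))
```

Under these assumptions one second of audio corresponds to roughly 86 token frames, so a generation limit expressed in frames translates directly into a maximum clip length.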
Quick Start & Requirements
uv sync (or uv sync --extra compile for hybrid) followed by uv pip install -e . (or .[compile]).
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
8 months ago
Inactive