hexgrad
TTS inference library for Kokoro-82M
Top 10.5% on SourcePulse
Kokoro is an open-weight Text-to-Speech (TTS) library built around the Kokoro-82M model, offering high-quality speech generation with significantly improved speed and cost-efficiency compared to larger models. It is designed for developers and researchers seeking a lightweight yet powerful TTS solution deployable across various environments.
How It Works
Kokoro leverages an 82-million parameter model, achieving quality comparable to larger systems. It utilizes the misaki library for G2P (Grapheme-to-Phoneme) conversion, supporting multiple languages. The library is designed for efficient inference, enabling faster generation times and reduced computational overhead.
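The flow described above (text in, G2P via misaki, audio out) can be sketched as follows. This is a minimal sketch, not the library's definitive usage: the `KPipeline` class, the `lang_code="a"` value, the `af_heart` voice name, and the 24 kHz sample rate are assumptions taken from the upstream README as recalled, not from this summary, and `synthesize` and `segment_path` are hypothetical helper names introduced only for illustration.

```python
from pathlib import Path

# Assumption: Kokoro emits 24 kHz audio (per the upstream README as recalled).
SAMPLE_RATE = 24000


def segment_path(out_dir, index):
    """Hypothetical helper: build a zero-padded WAV file name for one segment."""
    return Path(out_dir) / f"segment_{index:03d}.wav"


def synthesize(text, voice="af_heart", lang_code="a", out_dir="."):
    """Write one WAV file per generated segment and return the file paths.

    kokoro and soundfile are imported lazily so this module can be loaded
    even where the packages are not installed.
    """
    from kokoro import KPipeline  # assumed entry point of the kokoro package
    import soundfile as sf

    pipeline = KPipeline(lang_code=lang_code)  # "a" is assumed to mean American English
    paths = []
    # Assumption: each yielded item is (graphemes, phonemes, audio), where the
    # phonemes are the G2P output for that chunk of text.
    for i, (graphemes, phonemes, audio) in enumerate(pipeline(text, voice=voice)):
        path = segment_path(out_dir, i)
        sf.write(str(path), audio, SAMPLE_RATE)
        paths.append(path)
    return paths
```

With kokoro and soundfile installed, calling `synthesize("Kokoro is a lightweight text-to-speech model.")` would be expected to write files such as segment_000.wav to the current directory. Long inputs may be split into several segments, which is why the sketch writes one file per yielded item.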
Quick Start & Requirements
Install: pip install -q "kokoro>=0.9.4" soundfile
System dependency: espeak-ng (for English out-of-dictionary (OOD) fallback and some non-English languages).
The README provides installation instructions for Windows and macOS (MPS GPU acceleration).
A conda environment.yml is available for dependency management.
Highlighted Details
Uses the misaki G2P library.
Maintenance & Community
Discord: https://discord.gg/QuGxSWBfQy
Licensing & Compatibility
Limitations & Caveats
The misaki library requires a separate installation for non-English languages (e.g., pip install misaki[ja] for Japanese, pip install misaki[zh] for Mandarin Chinese). The README also notes that espeak-ng is needed for English OOD fallback and for some non-English languages, so support for a given language may be limited without espeak-ng or the matching misaki extra.
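As a rough illustration of the caveat above, a small pre-flight check could report which extra a language needs and whether espeak-ng is on the PATH. This is only a sketch: the mapping contains just the two extras named in the text, and `misaki_install_command` and `has_espeak_ng` are hypothetical helper names, not part of kokoro or misaki.

```python
import shutil

# Only the extras named in the text above; other languages are deliberately omitted.
MISAKI_EXTRAS = {
    "japanese": "ja",
    "mandarin chinese": "zh",
}


def misaki_install_command(language):
    """Return the pip command for the misaki extra that matches a language."""
    key = language.strip().lower()
    extra = MISAKI_EXTRAS.get(key)
    if extra is None:
        raise ValueError(
            f"No misaki extra listed here for {language!r}; check the project README."
        )
    # Quote the requirement so shells such as zsh do not expand the brackets.
    return f'pip install "misaki[{extra}]"'


def has_espeak_ng():
    """Return True if an espeak-ng executable is found on the PATH."""
    return shutil.which("espeak-ng") is not None
```

For example, `misaki_install_command("Japanese")` returns `pip install "misaki[ja]"`, and `has_espeak_ng()` can be checked before synthesis to warn early that fallback pronunciation may be unavailable.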