Kokoros by lucasjinreal

Rust crate for fast, high-quality TTS

created 6 months ago
572 stars

Top 57.2% on sourcepulse

Project Summary

Kokoros is a fast, high-quality Text-to-Speech (TTS) inference engine written in Rust and based on the Kokoro model. It targets developers and users who need real-time, embeddable TTS, and the project reports significant performance gains over Python-based implementations.

How It Works

Kokoros uses Rust for performance and memory safety, which enables efficient inference of the 87M-parameter Kokoro model. It integrates a phonemizer, so end-to-end synthesis does not need an external phonemization step. The project supports multiple languages (English, Chinese, Japanese, German), with more planned.

Quick Start & Requirements

  • Install Python dependencies: pip install -r scripts/requirements.txt
  • Fetch voice data: python scripts/fetch_voices.py
  • Build the Rust project: cargo build --release
  • Usage: ./target/release/koko [options]
  • Docker image available for easier deployment.
  • See official documentation for detailed usage and server setup.
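
The build steps above can be scripted. The following is a minimal Python sketch, not part of the project: it replays the three commands from the list in order and stops at the first failure. The names QUICK_START_STEPS and run_quick_start are hypothetical, and the sketch assumes the commands are run from the repository root with pip, python, and cargo available on the PATH.

```python
import subprocess
from pathlib import Path

# Commands copied from the quick-start list, in the order given there.
QUICK_START_STEPS = [
    ["pip", "install", "-r", "scripts/requirements.txt"],
    ["python", "scripts/fetch_voices.py"],
    ["cargo", "build", "--release"],
]


def run_quick_start(repo_dir, runner=subprocess.run):
    """Run each step from repo_dir and return the expected binary path.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so later steps do not run after a failure.
    The runner parameter exists only so the steps can be dry-run.
    """
    for cmd in QUICK_START_STEPS:
        runner(cmd, cwd=repo_dir, check=True)
    # Path of the binary named in the "Usage" bullet.
    return Path(repo_dir) / "target" / "release" / "koko"
```

Passing a stand-in for runner (for example, a function that only prints each command) gives a dry run without installing or building anything.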

Highlighted Details

  • Streaming mode supported for real-time audio output.
  • Style mixing for ASMR effects and voice variation.
  • OpenAI-compatible API server for seamless integration.
  • eSpeak NG tokenizer and phonemizer support for end-to-end synthesis.
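
To illustrate what "OpenAI-compatible" means for a client, here is a minimal Python sketch that builds a request in the shape of OpenAI's speech endpoint (POST /v1/audio/speech with model, input, and voice fields). The base URL, the voice name "af_sky", and the model name "tts-1" are assumptions chosen for illustration, not values confirmed by this page; check the project's documentation for what its server actually accepts.

```python
import json
import urllib.request

# Assumption: address of a locally running server; adjust to your deployment.
BASE_URL = "http://localhost:3000"


def build_speech_request(text, voice="af_sky", model="tts-1", base_url=BASE_URL):
    """Build (but do not send) a POST request shaped like OpenAI's speech API.

    The voice and model defaults are placeholders, not confirmed values.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    req = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(req) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a live server. Calling synthesize only works once a server is running at the chosen base URL.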

Maintenance & Community

  • Active development with recent updates including streaming, style mixing, and OpenAI server support.
  • Discord community available for discussion and support: https://discord.gg/E566zfDWqD

Licensing & Compatibility

  • Licensed under the Apache License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is under active development, and support for some languages is noted as only "partly" implemented. OpenAI server compatibility is still being polished.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 3
  • Issues (30d): 1

Star History

  • 74 stars in the last 90 days
