kokoro-onnx by thewh1teagle

Text-to-speech with ONNX Runtime

Created 1 year ago

2,389 stars

Top 18.8% on SourcePulse

Project Summary

kokoro-onnx is a text-to-speech (TTS) package that runs the Kokoro-TTS model on ONNX Runtime. It targets developers and users who want a fast, lightweight, multilingual TTS solution, and it offers near real-time performance on Apple Silicon (M1) with a compact model.

How It Works

The model runs on ONNX Runtime, which is what gives the package its fast inference. It supports multiple languages and voices, and version 1.0 models are available. The quantized models are the most lightweight option, at roughly 80MB compared with roughly 300MB for the full model.
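As a rough illustration of running a model file through ONNX Runtime, the sketch below opens the model and lists its inputs. It is not how kokoro-onnx itself loads the model; pick_providers, describe_model, and the preferred provider order are hypothetical names and choices made for this example, and it assumes onnxruntime is installed and kokoro-v1.0.onnx is in the working directory.

```python
from pathlib import Path

# Hypothetical preference order, chosen for illustration only.
PREFERRED = ("CoreMLExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are available, in order (CPU as fallback)."""
    chosen = [name for name in preferred if name in available]
    return chosen or ["CPUExecutionProvider"]


def describe_model(model_path="kokoro-v1.0.onnx"):
    """Open the model with ONNX Runtime and return (name, type, shape) per input."""
    import onnxruntime as ort  # imported lazily so the helper above works without it

    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(str(model_path), providers=providers)
    return [(arg.name, arg.type, arg.shape) for arg in session.get_inputs()]


if __name__ == "__main__":
    # Only inspect the model if it has already been downloaded.
    if Path("kokoro-v1.0.onnx").exists():
        for name, dtype, shape in describe_model():
            print(name, dtype, shape)
```

Listing the inputs this way is a quick check that a downloaded model file is intact before wiring it into an application.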

Quick Start & Requirements

Install uv for isolated Python environments: pip install uv
Initialize a new project with Python 3.12: uv init -p 3.12
Add the package: uv add kokoro-onnx soundfile
Download model files: kokoro-v1.0.onnx and voices-v1.0.bin.
Copy the contents of examples/save.py into hello.py and make sure the model files are in the same directory.
Run the script: uv run hello.py
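The steps above can be sketched as a small hello.py. This is an approximation under stated assumptions, not a copy of examples/save.py: the voice name af_sarah, the lang and speed values, the output file name, and the helper functions missing_model_files and synthesize are illustrative, and the Kokoro(...).create(...) call reflects the package's published usage as best understood here, so check it against the installed version.

```python
from pathlib import Path

# File names taken from the download step above.
MODEL_FILES = ("kokoro-v1.0.onnx", "voices-v1.0.bin")


def missing_model_files(directory="."):
    """Return the model files that are not present in the given directory."""
    base = Path(directory)
    return [name for name in MODEL_FILES if not (base / name).is_file()]


def synthesize(text, out_path="audio.wav", voice="af_sarah", directory="."):
    """Generate speech for text and write it to out_path as a WAV file."""
    # Imported lazily so the file check above works before the packages are installed.
    import soundfile as sf
    from kokoro_onnx import Kokoro

    base = Path(directory)
    kokoro = Kokoro(str(base / MODEL_FILES[0]), str(base / MODEL_FILES[1]))
    # Assumed call shape: voice, speed and lang values are illustrative.
    samples, sample_rate = kokoro.create(text, voice=voice, speed=1.0, lang="en-us")
    sf.write(out_path, samples, sample_rate)
    return out_path


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("Wrote", synthesize("Hello from kokoro-onnx."))
```

Checking for the model files first gives a clear message when the download step was skipped, rather than a failure from inside the model loader.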

Highlighted Details

Fast performance, near real-time on macOS M1.
Lightweight model: ~300MB, or ~80MB quantized.
Supports multiple languages and voices.

Maintenance & Community

Information regarding community channels, roadmap, or specific maintainers is not detailed in the provided README.

Licensing & Compatibility

The kokoro-onnx package is licensed under MIT, and the Kokoro model itself is licensed under Apache 2.0. Both are permissive licenses and are generally compatible with commercial and closed-source use.

Limitations & Caveats

The README recommends the misaki g2p (grapheme-to-phoneme) package for version 1.0 models, which suggests that compatibility or output quality may depend on the g2p implementation used. The README does not give further detail on this recommendation.
