kokoro-ios by mlalma

Fast, high-quality text-to-speech for Apple platforms

Created 1 year ago

271 stars

Top 94.8% on SourcePulse

Project Summary

kokoro-ios is a Swift port of the Kokoro text-to-speech model for iOS and macOS. It generates high-quality English speech faster than real time using Apple's MLX framework, and is aimed at developers who want to integrate speech synthesis into their applications.

How It Works

This project ports the Kokoro TTS implementation from MLX-Audio, a Python library built on MLX, to MLX Swift, so inference runs natively on Apple hardware. Input text is first converted to phonemes by a Grapheme-to-Phoneme (G2P) stage, primarily the MisakiSwift library, and the phonemes are then passed to the neural synthesizer. The main advantage is the MLX Swift implementation itself: the project reports roughly 3.3x faster-than-real-time generation on an iPhone 13 Pro.
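The two-stage flow described above (text to phonemes, phonemes to audio) can be sketched as follows. This is only an illustration of the architecture: every type and function name here is hypothetical and is not the kokoro-ios, MisakiSwift, or MLX Swift API. The G2P stage is a toy word-lexicon lookup and the synthesizer is a stand-in that emits silence.

```swift
// Illustrative two-stage TTS pipeline. All names are hypothetical and are
// NOT taken from kokoro-ios, MisakiSwift, or MLX Swift.

/// Stage 1: grapheme-to-phoneme conversion.
protocol Phonemizer {
    func phonemes(for text: String) -> [String]
}

/// Stage 2: turns a phoneme sequence into audio samples.
protocol Synthesizer {
    func samples(for phonemes: [String]) -> [Float]
}

/// Toy G2P: whole-word lexicon lookup; unknown words are skipped.
struct LexiconPhonemizer: Phonemizer {
    let lexicon: [String: [String]]

    func phonemes(for text: String) -> [String] {
        text.lowercased()
            .split(separator: " ")
            .flatMap { lexicon[String($0)] ?? [] }
    }
}

/// Stand-in for the neural model: a fixed block of silence per phoneme.
struct SilentSynthesizer: Synthesizer {
    let samplesPerPhoneme: Int

    func samples(for phonemes: [String]) -> [Float] {
        [Float](repeating: 0, count: phonemes.count * samplesPerPhoneme)
    }
}

/// Runs the two stages in order.
func speak<P: Phonemizer, S: Synthesizer>(_ text: String, g2p: P, model: S) -> [Float] {
    model.samples(for: g2p.phonemes(for: text))
}

let g2p = LexiconPhonemizer(lexicon: ["hi": ["h", "aɪ"], "there": ["ð", "ɛ", "ɹ"]])
let audio = speak("Hi there", g2p: g2p, model: SilentSynthesizer(samplesPerPhoneme: 100))
print(audio.count) // 5 phonemes x 100 samples = 500
```

Keeping the two stages behind separate protocols mirrors the split the summary describes, where G2P lives in its own library and can be swapped independently of the synthesizer.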

Quick Start & Requirements

Install with Swift Package Manager by adding .package(url: "https://github.com/mlalma/kokoro-ios.git", from: "1.0.0") to your package dependencies. The library requires iOS 18.0+ or macOS 15.0+, and depends on MLX Swift, MisakiSwift, and MLXUtilsLibrary. The package does not ship model weights: you must supply the Kokoro TTS model files and voice style embeddings yourself, typically by bundling them with the integrating application. Refer to the Kokoro Test App for usage examples.
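For a package-based project, the dependency line above might sit in a manifest like the sketch below. The package and target name "MyTTSApp" is a placeholder, and the product name "KokoroSwift" is an assumption that should be checked against the library's own Package.swift. The platform versions match the stated requirements.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyTTSApp", // placeholder name for the integrating package
    // Minimum OS versions stated in the requirements above.
    platforms: [.iOS(.v18), .macOS(.v15)],
    dependencies: [
        .package(url: "https://github.com/mlalma/kokoro-ios.git", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyTTSApp",
            dependencies: [
                // Assumption: the library product is named "KokoroSwift";
                // verify against the package's own manifest.
                .product(name: "KokoroSwift", package: "kokoro-ios")
            ]
        )
    ]
)
```

In an Xcode app project, the same repository URL can instead be added through File > Add Package Dependencies.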

Highlighted Details

Added token timestamps for finer-grained audio control (v1.0.8).
Voice style management is externalized to the integrating application (v1.0.5).
Achieves approximately 3.3x faster-than-real-time audio generation on an iPhone 13 Pro (release build, post-warm-up).
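To make the 3.3x figure concrete, the arithmetic can be sketched as below: a speed-up of 3.3 means about 3.3 seconds of audio per second of compute, so a 10-second clip takes roughly 3 seconds to generate. The function names are illustrative and not part of the library.

```swift
/// Speed-up over real time: seconds of audio produced per second of compute.
func speedup(audioSeconds: Double, generationSeconds: Double) -> Double {
    precondition(generationSeconds > 0, "generation time must be positive")
    return audioSeconds / generationSeconds
}

/// Expected generation time for a clip at a given speed-up factor.
func generationTime(audioSeconds: Double, factor: Double) -> Double {
    precondition(factor > 0, "speed-up factor must be positive")
    return audioSeconds / factor
}

// A 10-second clip at the reported 3.3x speed-up.
let clipTime = generationTime(audioSeconds: 10, factor: 3.3)
print(clipTime) // roughly 3.03 seconds
```

The reported figure is for a release build after warm-up, so a first call or a debug build can be expected to be slower.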

Maintenance & Community

The README does not list maintainers, sponsors, or community channels such as Discord or Slack.

Licensing & Compatibility

The project is licensed under the MIT License, which permits commercial use and integration into closed-source applications, provided the copyright and license notice are retained.

Limitations & Caveats

Users must independently source and manage the large TTS model files and voice style embeddings. The library mandates relatively recent Apple operating system versions (iOS 18+, macOS 15+). Integration requires familiarity with Swift Package Manager and Apple's development ecosystem.
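Because the model and voice files travel with the app, it helps to fail early when one was not packaged. A minimal sketch using Foundation's Bundle lookup is shown below. The file names "kokoro-model.safetensors" and "voices.npz" are placeholders rather than names required by the library, and bundledURL is a hypothetical helper.

```swift
import Foundation

enum ResourceError: Error, Equatable {
    case missing(String)
}

/// Returns the URL of a bundled file, or throws if it was not packaged.
func bundledURL(name: String, ext: String, in bundle: Bundle = .main) throws -> URL {
    guard let url = bundle.url(forResource: name, withExtension: ext) else {
        throw ResourceError.missing("\(name).\(ext)")
    }
    return url
}

// Placeholder file names; substitute whatever files your app actually bundles.
do {
    let model = try bundledURL(name: "kokoro-model", ext: "safetensors")
    let voices = try bundledURL(name: "voices", ext: "npz")
    print("Model: \(model.path), voices: \(voices.path)")
} catch {
    print("Missing resource: \(error)")
}
```

Checking at launch surfaces a packaging mistake immediately instead of at the first synthesis call.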
