KittenTTS by KittenML

Realistic text-to-speech model under 25MB

Created 11 months ago

15,204 stars

Top 3.4% on SourcePulse

Project Summary

KittenTTS is an open-source, ultra-lightweight text-to-speech model designed for high-quality voice synthesis on any device, even without a GPU. It targets developers and users who need efficient, realistic TTS, and it offers fast inference and several premium voices in a model under 25MB.

How It Works

KittenTTS uses a model with approximately 15 million parameters, optimized for CPU execution. The small parameter count favors efficiency and broad compatibility, enabling real-time speech synthesis on standard hardware without a GPU.
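The relationship between the parameter count and the size claim can be checked with simple arithmetic. The sketch below takes the roughly 15 million parameters from the text and computes raw weight storage at a few common precisions. The precisions are illustrative assumptions, not a statement about how KittenTTS actually stores its weights, and the calculation ignores file-format overhead and treats 1 MB as 10^6 bytes.

```python
# Back-of-envelope weight storage for a model with ~15 million parameters.
# The parameter count comes from the text; the precisions are illustrative
# assumptions, not a description of the actual KittenTTS file format.

PARAMS = 15_000_000


def weights_megabytes(num_params: int, bits_per_param: int) -> float:
    """Raw weight storage in megabytes (1 MB = 10**6 bytes), ignoring overhead."""
    return num_params * bits_per_param / 8 / 1_000_000


def max_bits_per_param(num_params: int, budget_mb: float) -> float:
    """Largest average number of bits per parameter that fits in the budget."""
    return budget_mb * 1_000_000 * 8 / num_params


if __name__ == "__main__":
    for label, bits in [("float32", 32), ("float16", 16), ("int8", 8)]:
        print(f"{label:>8}: {weights_megabytes(PARAMS, bits):.0f} MB")
    bits = max_bits_per_param(PARAMS, 25)
    print(f"A 25 MB budget allows about {bits:.1f} bits per parameter")
```

Under these assumptions, 32-bit weights would need 60 MB and 16-bit weights 30 MB, so a file under 25MB implies an average of fewer than about 13.3 bits per parameter, for example through 8-bit quantization. This is an inference from the two published numbers, not a documented detail of the model.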

Quick Start & Requirements

Primary install: pip install https://github.com/KittenML/KittenTTS/releases/download/0.1/kittentts-0.1.0-py3-none-any.whl
Requirements: Python 3.x. No GPU or CUDA installation is required.
Setup time: Minimal; installation is a single pip command.
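Once the wheel is installed, synthesis might look like the sketch below. This page does not document the Python API, so the class name `KittenTTS`, the `generate` method, the model identifier, the voice name, and the 24 kHz sample rate are all assumptions recalled from the project README and should be verified there. The helper names `pcm16_bytes`, `write_wav`, and `speak_to_file` are hypothetical and exist only for this example. The output is assumed to be a one-dimensional sequence of floats in the range -1.0 to 1.0.

```python
import struct
import wave

# Assumed output sample rate; verify against the project README.
SAMPLE_RATE = 24_000


def pcm16_bytes(samples) -> bytes:
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM, clipping outliers."""
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav(path: str, samples, sample_rate: int = SAMPLE_RATE) -> None:
    """Write mono 16-bit audio to a WAV file using only the standard library."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(pcm16_bytes(samples))


def speak_to_file(
    text: str,
    path: str,
    voice: str = "expr-voice-2-f",                 # assumed voice name
    model_id: str = "KittenML/kitten-tts-nano-0.1",  # assumed model identifier
) -> None:
    """Synthesize text and save it as a WAV file (hypothetical helper)."""
    # Imported lazily so the helpers above work without the package installed.
    from kittentts import KittenTTS  # assumed import path

    model = KittenTTS(model_id)
    audio = model.generate(text, voice=voice)  # assumed call signature
    write_wav(path, audio)
```

With the package installed, calling `speak_to_file("This model runs without a GPU.", "output.wav")` would write the result to disk if the assumed names match the released API. The WAV writing uses the standard-library `wave` module so that the example adds no dependencies beyond the package itself.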

Highlighted Details

Model size under 25MB.
CPU-optimized for inference without a GPU.
Offers several premium voice options.
Optimized for fast, real-time speech synthesis.
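A common way to check a real-time claim like the one above is the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means audio is generated faster than it plays back. The sketch below measures it for any callable that returns a sequence of samples. The function names are hypothetical, the 24 kHz rate in the usage line is an assumption, and the stand-in synthesizer is a placeholder rather than the KittenTTS model.

```python
import time
from typing import Callable, Sequence


def rtf_from_timing(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Real-time factor: time spent synthesizing divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return elapsed_seconds / audio_seconds


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int,
) -> float:
    """Time one synthesis call and return its real-time factor."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return rtf_from_timing(elapsed, len(samples), sample_rate)


def silent_second(text: str) -> Sequence[float]:
    """Placeholder synthesizer: returns one second of silence at 24 kHz."""
    return [0.0] * 24_000


if __name__ == "__main__":
    # Swap silent_second for a real synthesis function to measure a model.
    print(f"RTF: {measure_rtf(silent_second, 'hello', 24_000):.6f}")
```

A single call is noisy, so averaging over several sentences of different lengths, after one warm-up call, gives a more representative figure for a given CPU.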

Maintenance & Community

The project is currently in developer preview. A Discord server is available for community engagement.

Licensing & Compatibility

The license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is in developer preview: the current release is a preview model, and the fully trained model weights and the mobile and web SDKs have not yet been released.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive

Star History

1,053 stars in the last 30 days