KittenTTS by KittenML

Realistic text-to-speech model under 25MB

Created 2 months ago
8,928 stars

Top 5.7% on SourcePulse

Project Summary

KittenTTS is an open-source, ultra-lightweight text-to-speech model for realistic voice synthesis on CPU-only devices. Aimed at developers who need efficient TTS, it offers fast inference and several premium voice options in a model under 25MB.

How It Works

KittenTTS uses a model with approximately 15 million parameters, optimized for CPU execution. The small parameter count prioritizes efficiency and broad compatibility, enabling real-time speech synthesis on standard hardware without a GPU.
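To make the relationship between parameter count and download size concrete, the sketch below estimates raw weight storage for a 15-million-parameter model at a few common numeric widths. The precision KittenTTS actually uses is not stated in this summary, so the widths listed are assumptions for illustration, and the estimate ignores file-format overhead.

```python
def model_size_mb(num_params: int, bytes_per_param: float) -> float:
    """Raw weight storage in megabytes (1 MB = 10**6 bytes), ignoring file overhead."""
    return num_params * bytes_per_param / 1e6


NUM_PARAMS = 15_000_000  # "approximately 15 million" per the summary
SIZE_LIMIT_MB = 25       # "under 25MB" per the summary

# Common storage widths in bytes. Which one KittenTTS uses is NOT stated
# in the source; these are assumptions for the sake of the arithmetic.
PRECISIONS = {"float32": 4, "float16": 2, "int8": 1}

sizes = {name: model_size_mb(NUM_PARAMS, width) for name, width in PRECISIONS.items()}

# Largest average width per parameter that still fits the stated budget.
max_bytes_per_param = SIZE_LIMIT_MB * 1e6 / NUM_PARAMS

if __name__ == "__main__":
    for name, size in sizes.items():
        verdict = "fits" if size < SIZE_LIMIT_MB else "exceeds"
        print(f"{name}: {size:.0f} MB ({verdict} the {SIZE_LIMIT_MB} MB budget)")
    print(f"Budget allows about {max_bytes_per_param:.2f} bytes per parameter")
```

Under these assumptions, a sub-25MB file implies the weights average fewer than two bytes per parameter, which is consistent with some form of reduced-precision storage, though the summary does not confirm the mechanism.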

Quick Start & Requirements

  • Primary install: pip install https://github.com/KittenML/KittenTTS/releases/download/0.1/kittentts-0.1.0-py3-none-any.whl
  • Requirements: Python 3.x. No GPU or CUDA installation is required.
  • Setup time: Minimal; installation is a single pip command.

Highlighted Details

  • Model size under 25MB.
  • CPU-optimized for inference without a GPU.
  • Offers several premium voice options.
  • Optimized for fast, real-time speech synthesis.
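
One way to check the real-time claim on your own hardware is to measure the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means speech is generated faster than it plays back. The sketch below is a minimal, library-agnostic helper. The `synthesize` callable, the `fake_synthesize` stand-in, and the 24,000 Hz sample rate are all assumptions for illustration and are not taken from the KittenTTS API.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    start = clock()
    samples = synthesize(text)
    elapsed = clock() - start

    audio_seconds = len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("synthesizer returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> list:
    """Hypothetical stand-in for a TTS call: returns 0.1 s of silence per character."""
    assumed_sample_rate = 24_000  # assumption, not a documented KittenTTS value
    return [0.0] * (len(text) * assumed_sample_rate // 10)


if __name__ == "__main__":
    rtf = real_time_factor(fake_synthesize, "Hello from a tiny model.", 24_000)
    print(f"RTF: {rtf:.4f} ({'faster' if rtf < 1 else 'slower'} than real time)")
```

To measure a real model, replace `fake_synthesize` with a function that wraps the library's own synthesis call and returns the audio samples, and pass the sample rate the model actually outputs.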

Maintenance & Community

The project is currently in developer preview. A Discord server is available for community engagement.

Licensing & Compatibility

The license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The current release is a preview model; the fully trained model weights and the mobile/web SDKs have not yet been released.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 1

Star History

308 stars in the last 30 days
