open-tts-tracker by Vaibhavs10

Open TTS Tracker: resource for open-access TTS models

Created 1 year ago
1,135 stars

Top 33.9% on SourcePulse

Project Summary

This repository serves as a curated catalog of open-source Text-to-Speech (TTS) models, aiming to increase awareness and accessibility for researchers, developers, and enthusiasts. It provides a centralized resource to track the latest advancements in open-access TTS technology.

How It Works

The project maintains a single comparison table of open-source TTS models. Each entry links to the model's repository and pre-trained weights (often on Hugging Face Hub) and lists its license, supported languages, and whether a paper and a demo are available. The table also records processor requirements (CPU/CUDA) and capability flags: phonetic alphabet support, voice cloning, emotional control, prompting, streaming, and speech control.
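As a rough illustration of how such a table could be consumed programmatically, the sketch below models one row as a Python dataclass and filters rows by language and capability. The class name `TTSModelEntry`, the function `find_models`, all field names, and the two catalog rows are hypothetical placeholders; the repository itself publishes the table as documentation, not as a data file with this schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TTSModelEntry:
    """One tracker row. Field names are illustrative, not the repository's schema."""
    name: str
    repo_url: str
    license: str
    languages: List[str]
    weights_url: Optional[str] = None  # often a Hugging Face Hub link
    processors: List[str] = field(default_factory=list)  # e.g. ["CPU", "CUDA"]
    voice_cloning: bool = False
    emotional_control: bool = False
    streaming: bool = False


def find_models(entries, language=None, needs_cloning=False, needs_streaming=False):
    """Return the entries that match the requested language and capability flags."""
    matches = []
    for entry in entries:
        if language is not None and language not in entry.languages:
            continue
        if needs_cloning and not entry.voice_cloning:
            continue
        if needs_streaming and not entry.streaming:
            continue
        matches.append(entry)
    return matches


# Placeholder rows for illustration only; these are not real models.
catalog = [
    TTSModelEntry("example-model-a", "https://example.com/a", "MIT",
                  ["English"], processors=["CPU", "CUDA"], streaming=True),
    TTSModelEntry("example-model-b", "https://example.com/b", "CC-BY-NC 4.0",
                  ["English", "German"], processors=["CUDA"], voice_cloning=True),
]

cloning_models = find_models(catalog, language="English", needs_cloning=True)
print([m.name for m in cloning_models])
```

Keeping the capability columns as booleans makes it cheap to combine several requirements in one query, which mirrors how a reader would scan the table by eye.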

Quick Start & Requirements

This repository is a tracking list, not a runnable application. To use any of the listed TTS models, users must refer to the individual model's repository for installation and usage instructions.

Highlighted Details

  • Comprehensive comparison of over 30 open-source TTS models.
  • Detailed feature breakdown including processor requirements, phonetic alphabets, voice cloning, emotional control, and streaming support.
  • Categorization by license type, language support, and availability of demos and papers.
  • Focuses exclusively on TTS models with open-source or open-access codebases.

Maintenance & Community

The project is maintained by Vaibhavs10. Users are encouraged to contribute by submitting Pull Requests for missing models or creating demos on Hugging Face Hub. Direct contact is available via Twitter @reach_vb.

Licensing & Compatibility

The repository itself does not declare a license. The listed TTS models carry a variety of licenses, including MIT, Apache 2.0, CC-BY-NC 4.0, CC-BY, BSD-3, MPL 2.0, CPML, and GPL-3.0. Some, such as CC-BY-NC 4.0 and CPML, restrict commercial use. GPL-3.0 is a copyleft license, so distributing a derivative work generally requires releasing it under the same terms, which can rule out closed-source use.
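A first-pass license triage along these lines might look like the sketch below. The function names, the bucket labels, and the sample model names are hypothetical. The non-commercial and GPL buckets follow the paragraph above; treating MIT, Apache 2.0, and BSD-3 as permissive reflects their general character; everything else falls into a "read-terms" bucket. This is only a screening aid under those assumptions, and the actual license text of each model remains the authority.

```python
# License groupings; labels are illustrative, not an official classification.
NON_COMMERCIAL = {"CC-BY-NC 4.0", "CPML"}
STRONG_COPYLEFT = {"GPL-3.0"}
PERMISSIVE = {"MIT", "Apache 2.0", "BSD-3"}


def triage_license(name):
    """Map a license name from the tracker to a coarse screening bucket."""
    if name in NON_COMMERCIAL:
        return "non-commercial"
    if name in STRONG_COPYLEFT:
        return "copyleft"
    if name in PERMISSIVE:
        return "permissive"
    # CC-BY, MPL 2.0, and anything unrecognized need a closer look.
    return "read-terms"


def commercial_candidates(model_licenses):
    """Return model names whose license lands in the permissive bucket."""
    return [model for model, lic in model_licenses.items()
            if triage_license(lic) == "permissive"]


# Placeholder model names for illustration only.
sample = {
    "example-model-a": "MIT",
    "example-model-b": "CC-BY-NC 4.0",
    "example-model-c": "GPL-3.0",
    "example-model-d": "MPL 2.0",
}

print(commercial_candidates(sample))
```

Defaulting unknown licenses to "read-terms" instead of "permissive" keeps the screen conservative: a model is only shortlisted when its license is explicitly recognized.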

Limitations & Caveats

The list is a snapshot and may not be exhaustive or perfectly up-to-date. Some models are marked as "Unofficial Repo" or have "Not Available" for certain details, indicating potential gaps in information or community support. Some models have specific non-commercial clauses or dependencies on other GPL-licensed components.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

5 stars in the last 30 days

Explore Similar Projects

Starred by Alex Chen (Cofounder of Nexa AI), David Singleton (Cofounder of /dev/agents; Ex-CTO of Stripe), and 1 more.

kokoro by hexgrad

1.2%
4k
TTS inference library for Kokoro-82M
Created 8 months ago
Updated 1 month ago
Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

GPT-SoVITS by RVC-Boss

0.3%
51k
Few-shot voice cloning and TTS web UI
Created 1 year ago
Updated 1 week ago