MeloTTS by myshell-ai

Multilingual text-to-speech library

Created 2 years ago

7,533 stars

Top 6.8% on SourcePulse

1 Expert Loves This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Project Summary

MeloTTS is a high-quality multilingual text-to-speech library aimed at researchers and developers. It synthesizes natural-sounding speech in several languages and English accents.

How It Works

MeloTTS uses an architecture that builds on VITS, VITS2, and Bert-VITS2. Inference is efficient enough to run in real time on a CPU. The library supports multiple languages, several English accents, and mixed Chinese and English input for the Chinese speaker.
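"Real time" is usually quantified with a real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced, where a value below 1 means synthesis finishes faster than playback. The README digest above gives no figures, so the following is only a minimal sketch of how one might measure it. The `synthesize` callable is a hypothetical stand-in for any function that writes a WAV file, not a MeloTTS API.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize, text, output_path):
    """Time one call to synthesize(text, output_path) and return its RTF.

    `synthesize` is a hypothetical callable (an assumption for this sketch)
    that writes a WAV file to output_path.
    """
    start = time.perf_counter()
    synthesize(text, output_path)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, wav_duration_seconds(output_path))
```

A single timing is noisy and the first call often includes model loading, so averaging over several sentences after a warm-up call gives a fairer number.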

Quick Start & Requirements

Install: pip install melo-tts
Prerequisites: Python. Models are available on HuggingFace.
Resources: Inference is fast enough to run in real time on a CPU.
Docs: https://github.com/myshell-ai/MeloTTS
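Once installed, basic usage might look like the sketch below. It assumes the package exposes `melo.api.TTS` with a `tts_to_file` method and an `EN-US` speaker key, as in the upstream README at the time of writing; check the Docs link above if the API has changed. The wrapper name `synthesize_to_file` and its defaults are illustrative and not part of the library.

```python
def synthesize_to_file(text, output_path, language="EN", speaker="EN-US",
                       speed=1.0, device="auto"):
    """Synthesize `text` to a WAV file and return the output path.

    Assumption: melo.api.TTS, model.hps.data.spk2id, and tts_to_file behave
    as shown in the upstream README. This wrapper itself is hypothetical.
    """
    # Imported lazily so this module can be loaded without MeloTTS installed.
    from melo.api import TTS

    model = TTS(language=language, device=device)
    # Maps speaker names (for example "EN-US") to numeric speaker ids.
    speaker_ids = model.hps.data.spk2id
    model.tts_to_file(text, speaker_ids[speaker], output_path, speed=speed)
    return output_path
```

Calling `synthesize_to_file("Did you ever hear a folk tale about a giant turtle?", "en-us.wav")` would download the English model from HuggingFace on first use and write the audio to `en-us.wav`.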

Highlighted Details

Supports 10 languages and multiple English accents (American, British, Indian, Australian).
The Chinese speaker supports mixed Chinese and English input.
Optimized for fast CPU real-time inference.
Based on VITS, VITS2, and Bert-VITS2 architectures.

Maintenance & Community

The project is led by MyShell.ai and includes contributors from Tsinghua University and MIT. Further contributions are welcomed.

Licensing & Compatibility

Released under the MIT License, permitting free commercial and non-commercial use.

Limitations & Caveats

The README does not specify hardware requirements beyond the claim of real-time CPU inference, and it does not document model sizes or per-language limitations.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

70 stars in the last 30 days