ChatTTS-OpenVoice  by HKoon

Advanced voice cloning and speech synthesis

Created 1 year ago
462 stars

Top 64.9% on SourcePulse

GitHubView on GitHub
Project Summary

Summary The HKoon/ChatTTS-OpenVoice project fuses ChatTTS for natural speech generation with OpenVoice's tone transplantation module, enabling personalized voice cloning from a short audio clip. It targets users seeking enhanced speech authenticity and offers an experimental LLM integration for advanced text pre-processing.

How It Works This project synergistically combines ChatTTS for its advanced, natural-sounding speech synthesis capabilities with OpenVoice's sophisticated voice timber simulation module. This fusion allows users to upload a brief 10-second audio sample to effectively clone a personalized voice. Furthermore, an experimental LLM integration module (llm.py) provides an OpenAI-compatible API wrapper. This wrapper facilitates text normalization and pre-processing crucial for TTS inference, supporting multiple LLM providers like Kimi, DeepSeek, and MiniMax.

Quick Start & Requirements

  • Demo: A live demo is available on Hugging Face: https://huggingface.co/spaces/Hilley/ChatTTS-OpenVoice.
  • Prerequisites: Users must manually download the OpenVoice Checkpoint files and place them within the ./OpenVoice/checkpoint directory.
  • Dependencies: Utilizing the experimental LLM integration requires obtaining API keys for supported providers (Kimi, DeepSeek, MiniMax). Python code examples are provided for interacting with the MiniMax API.

Highlighted Details

  • Enables high-fidelity personalized voice cloning by merging ChatTTS and OpenVoice technologies.
  • Facilitates seamless tone transplantation, allowing generated speech to adopt the characteristics of a reference voice.
  • Features an experimental LLM integration for robust text pre-processing, including normalization, supporting providers like Kimi (Moonshot AI), DeepSeek, and MiniMax.
  • The MiniMax integration supports multiple models (e.g., M2.7, M2.5) with extensive 204K context windows.

Maintenance & Community No specific details regarding project maintenance, community channels (like Discord or Slack), or notable contributors are present in the provided README.

Licensing & Compatibility The README does not specify the project's license type or provide compatibility notes for commercial use or integration with closed-source projects.

Limitations & Caveats

  • Requires manual download and correct placement of OpenVoice checkpoint files.
  • The LLM integration functionality is explicitly labeled as experimental.
  • No direct installation instructions or package management details (e.g., pip install) are provided beyond the demo and code snippets.
Health Check
Last Commit

1 week ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
2 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), Pietro Schirano Pietro Schirano(Founder of MagicPath), and
2 more.

metavoice-src by metavoiceio

0.0%
4k
TTS model for human-like, expressive speech
Created 2 years ago
Updated 1 year ago
Starred by Georgios Konstantopoulos Georgios Konstantopoulos(CTO, General Partner at Paradigm) and Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems").

GPT-SoVITS by RVC-Boss

0.3%
58k
Few-shot voice cloning and TTS web UI
Created 2 years ago
Updated 3 weeks ago
Feedback? Help us improve.