MoeTTS  by luoyily

Speech synthesis model/GUI for galgame characters

created 3 years ago
988 stars

Top 38.3% on sourcepulse

Project Summary

MoeTTS provides a GUI and pre-trained models for synthesizing speech in the style of galgame characters. It targets fans and hobbyists interested in AI-driven voice generation, wrapping text-to-speech (TTS) and voice conversion models in a graphical interface.

How It Works

The project integrates several speech synthesis models, including Tacotron2, HiFi-GAN, and VITS, along with the Diff-svc model for voice conversion. This combination covers both generating speech from text and transforming existing audio into the voice of a chosen character. The GUI handles model selection, text input, and voice conversion parameters.
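The flow described here (text cleaning, then TTS, then optional voice conversion) can be pictured as a simple chain of stages. The sketch below is only an illustration under stated assumptions: `SynthesisPipeline`, `toy_clean`, `toy_tts`, and `toy_convert` are hypothetical names with toy stand-in logic, not MoeTTS code and not the API of Tacotron2, HiFi-GAN, VITS, or Diff-svc.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Waveform = List[float]  # stand-in for a buffer of audio samples


@dataclass
class SynthesisPipeline:
    """Hypothetical three-stage chain: clean text -> TTS -> optional conversion."""
    clean_text: Callable[[str], str]
    tts: Callable[[str], Waveform]
    convert_voice: Optional[Callable[[Waveform], Waveform]] = None

    def run(self, text: str) -> Waveform:
        cleaned = self.clean_text(text)       # e.g. a G2P / normalization step
        audio = self.tts(cleaned)             # e.g. a Tacotron2 + vocoder or VITS model
        if self.convert_voice is not None:    # e.g. a voice conversion model
            audio = self.convert_voice(audio)
        return audio


# Toy stand-ins so the sketch runs without any model files (assumptions only).
def toy_clean(text: str) -> str:
    return text.strip().lower()


def toy_tts(text: str) -> Waveform:
    return [float(ord(ch)) for ch in text]


def toy_convert(audio: Waveform) -> Waveform:
    return [sample * 0.5 for sample in audio]


pipeline = SynthesisPipeline(toy_clean, toy_tts, toy_convert)
result = pipeline.run(" Hi ")
```

Leaving `convert_voice` as `None` corresponds to plain TTS output with no voice conversion applied.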

Quick Start & Requirements

  • Installation: Precompiled GUI executables are provided.
  • Prerequisites: Primarily CPU-based, but a "gpu" branch is available for GPU acceleration. Dependencies are bundled in the provided executables.
  • Resources: CPU inference can be slow, especially with options such as Crepe enabled.

Highlighted Details

  • Supports both single-character and multi-character VITS models, with a GUI feature for selecting speakers.
  • Offers integrated voice conversion using Diff-svc, allowing synthesized speech to be further processed by a different voice model.
  • Includes options for automatic text cleaning (Japanese G2P) and advanced Diff-svc parameters like pitch shifting and Crepe noise reduction.
  • Precompiled GUIs are available for easier setup, abstracting complex dependencies.
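As a rough illustration of what a pitch-shifting parameter does, the sketch below scales a pitch (F0) contour by a number of semitones. It assumes the shift is expressed in equal-tempered semitones and that unvoiced frames are marked with 0 Hz; the summary does not state the unit MoeTTS or Diff-svc uses, and `semitones_to_ratio` and `shift_f0` are hypothetical helper names.

```python
from typing import List


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def shift_f0(f0_hz: List[float], semitones: float) -> List[float]:
    """Scale voiced frames of an F0 contour; keep unvoiced frames (0 Hz) at 0."""
    ratio = semitones_to_ratio(semitones)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


# Shifting up one octave (12 semitones) doubles every voiced frequency.
shifted = shift_f0([220.0, 0.0, 110.0], 12)
```

Under these assumptions, positive values raise the voice and negative values lower it, with 12 semitones corresponding to one octave.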

Maintenance & Community

The project states that work on the GUI is complete and that it will no longer be actively developed. Model sharing is also no longer actively supported. Key contributors include ShiroDoMain and menproject.

Licensing & Compatibility

The project is released under an open-source license but includes additional user agreements. Commercial use of the software, pre-trained models, or derivatives is strictly prohibited. Use for original game production is also forbidden.

Limitations & Caveats

The project is no longer actively maintained, meaning no new features or models will be added. Compatibility is limited to TTS models with unmodified architectures; modified versions like so-vits or emo-vits are not supported. Users are responsible for any consequences arising from the use of provided models, which may originate from community contributions.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
