MoeTTS by luoyily

Speech synthesis model/GUI for galgame characters

Created 3 years ago

990 stars

Top 36.8% on SourcePulse

Project Summary

MoeTTS provides a GUI and pre-trained models for synthesizing speech in the style of galgame characters. It targets fans and hobbyists interested in AI voice generation, and wraps text-to-speech (TTS) and voice conversion models in a single desktop interface.

How It Works

The project integrates several speech synthesis models, including Tacotron 2, HiFi-GAN, and VITS, along with Diff-SVC for voice conversion. Text is first synthesized into speech, and the synthesized speech (or an existing recording) can then be converted into the voice of a chosen character. The GUI handles model selection, text input, and voice conversion parameters.
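The two-stage flow described above (synthesis, then optional voice conversion) can be sketched as follows. This is a minimal illustration only, not the project's code: `Pipeline`, `dummy_tts`, and `dummy_vc` are hypothetical names, and the stub functions merely stand in for a real TTS model (Tacotron 2 + HiFi-GAN or VITS) and a real Diff-SVC model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumption for this sketch: a waveform is a plain list of float samples.
Waveform = List[float]


@dataclass
class Pipeline:
    """Hypothetical two-stage pipeline: text -> speech -> (optional) voice conversion."""

    synthesize: Callable[[str], Waveform]                     # stand-in for a TTS model
    convert: Optional[Callable[[Waveform], Waveform]] = None  # stand-in for voice conversion

    def run(self, text: str) -> Waveform:
        audio = self.synthesize(text)
        # Voice conversion is optional; skip it when no converter is configured.
        if self.convert is not None:
            audio = self.convert(audio)
        return audio


def dummy_tts(text: str) -> Waveform:
    """Stub synthesizer: emits one sample per input character."""
    return [1.0] * len(text)


def dummy_vc(audio: Waveform) -> Waveform:
    """Stub converter: halves the amplitude so the effect is visible."""
    return [sample * 0.5 for sample in audio]


tts_only = Pipeline(dummy_tts).run("abc")
tts_then_vc = Pipeline(dummy_tts, dummy_vc).run("abc")
```

Keeping the converter optional mirrors the description above, where voice conversion is an extra step applied to speech that has already been synthesized.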

Quick Start & Requirements

Installation: Precompiled GUI executables are provided.
Prerequisites: Primarily CPU-based, but a "gpu" branch is available for GPU acceleration. Specific dependencies are managed within the provided executables.
Resources: CPU inference can be slow, especially with features like CREPE enabled.
Links:
- Hugging Face Spaces Demo: https://huggingface.co/spaces/luoyily/MoeTTS

Highlighted Details

Supports both single-character and multi-character VITS models, with a GUI feature for selecting speakers.
Offers integrated voice conversion using Diff-SVC, allowing synthesized speech to be further processed by a different voice model.
Includes options for automatic text cleaning (Japanese G2P) and advanced Diff-SVC parameters such as pitch shifting and the CREPE option (CREPE is a neural pitch estimator; the source describes this option as noise reduction).
Precompiled GUIs are available for easier setup, abstracting complex dependencies.
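For readers unfamiliar with the pitch shifting option mentioned above, the sketch below shows the standard relationship between a shift in semitones and the resulting frequency ratio. It assumes the shift is expressed in semitones, which the source does not state. The function names are hypothetical and this is not MoeTTS or Diff-SVC code.

```python
from typing import List


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` (12 semitones = one octave = 2x)."""
    return 2.0 ** (semitones / 12.0)


def shift_f0(f0_hz: List[float], semitones: float) -> List[float]:
    """Shift a pitch contour (Hz per frame) by a number of semitones.

    Frames with f0 <= 0 are treated as unvoiced and left at 0.0,
    a common convention that is assumed here.
    """
    ratio = semitones_to_ratio(semitones)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


# Shifting up one octave doubles every voiced frame's frequency.
octave_up = shift_f0([220.0, 0.0], 12)
```

A positive value raises the pitch and a negative value lowers it; a shift of 0 leaves the contour unchanged.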

Maintenance & Community

The project states that GUI development is complete and the GUI will no longer be actively maintained. Model sharing is also no longer actively supported. Key contributors include ShiroDoMain and menproject.

Licensing & Compatibility

The project is released under an open-source license but includes additional user agreements. Commercial use of the software, pre-trained models, or derivatives is strictly prohibited. Use for original game production is also forbidden.

Limitations & Caveats

The project is no longer actively maintained, so no new features or models will be added. Compatibility is limited to TTS models with unmodified architectures; modified variants such as so-vits or emo-vits are not supported. The provided models may originate from community contributions, and users are responsible for any consequences of using them.


Explore Similar Projects

Meta-voicebox by SpeechifyInc

assem-vc by maum-ai

voicebox-pytorch by lucidrains

NeuralSVB by MoonInTheRiver

FastDiff by Rongjiehuang

parrots by shibing624

fish-diffusion by fishaudio

vits-simple-api by Artrajz

parler-tts by huggingface

VoiceCraft by jasonppy

PaddleSpeech by PaddlePaddle

GPT-SoVITS by RVC-Boss