xtts-webui  by daswer123

WebUI for XTTS, a text-to-speech model, and fine-tuning

Created 1 year ago
845 stars

Top 42.3% on SourcePulse

Project Summary

XTTS-WebUI provides a user-friendly web interface for the XTTS speech synthesis model, targeting users who want to generate high-quality speech, clone voices, and perform audio tasks. It offers batch processing, translation with voice saving, and integration with other AI voice tools like RVC and OpenVoice, simplifying complex audio manipulation for content creators and developers.

How It Works

The web UI leverages XTTSv2 for speech synthesis and integrates additional neural networks and audio tools for enhanced output quality. It supports batch processing for multiple files and allows for voice cloning and translation. Users can fine-tune XTTS models directly within the interface, enabling the creation of custom, high-quality voice models. The architecture allows for modular integration of tools like RVC, OpenVoice, and Resemble Enhance, offering flexibility in audio post-processing.
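The batch and modular post-processing ideas above can be sketched as a small file pipeline. This is an illustration only, not the project's code: `plan_batch`, `run_batch`, and the `Stage` signature are hypothetical names, and each stage is a placeholder for whatever tool (synthesis, RVC, OpenVoice, Resemble Enhance) would be wired in.

```python
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage signature (an assumption, not the project's API):
# read audio from `src` and write the processed result to `dst`.
Stage = Callable[[Path, Path], None]


def plan_batch(input_dir: Path, output_dir: Path,
               pattern: str = "*.wav") -> List[Tuple[Path, Path]]:
    """Pair every matching input file with an output path of the same name."""
    return [(src, output_dir / src.name)
            for src in sorted(input_dir.glob(pattern))]


def run_batch(pairs: Sequence[Tuple[Path, Path]],
              stages: Sequence[Stage]) -> List[Path]:
    """Apply the stages in order to every file and return the output paths.

    The first stage reads the original input; each later stage reads the
    previous output and rewrites it in place.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    finished = []
    for src, dst in pairs:
        dst.parent.mkdir(parents=True, exist_ok=True)
        current = src
        for stage in stages:
            stage(current, dst)
            current = dst
        finished.append(dst)
    return finished
```

Keeping each tool behind the same two-path signature is one way to get the flexibility the summary describes: stages can be added, removed, or reordered without touching the batch loop.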

Quick Start & Requirements

  • Installation: Run install.bat (Windows) or install.sh (Linux), then launch with start_xtts_webui.bat or start_xtts_webui.sh.
  • Prerequisites: Python 3.10.x or 3.11, CUDA 11.8 or 12.1, Microsoft Build Tools 2019 (with C++ package), ffmpeg.
  • Hardware: NVIDIA GPU with 6GB VRAM recommended for portable version.
  • Documentation: https://github.com/daswer123/xtts-webui
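Before running the installer, some of the prerequisites listed above can be checked from Python. The following is a minimal sketch, not part of the project: `check_prerequisites` is a hypothetical helper, and it only covers the Python version and ffmpeg. CUDA and Microsoft Build Tools are not checked.

```python
import shutil
import sys
from typing import Callable, List, Optional, Sequence

# Python versions named in the prerequisites list (3.10.x or 3.11).
SUPPORTED_PYTHON = {(3, 10), (3, 11)}


def check_prerequisites(
    version: Optional[Sequence[int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK.

    `version` and `which` are injectable so the check can be exercised
    without changing the real environment.
    """
    problems = []
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if major_minor not in SUPPORTED_PYTHON:
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} found; 3.10.x or 3.11 expected"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```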

Highlighted Details

  • Supports batch processing for dubbing large numbers of files.
  • Integrates with RVC, OpenVoice, and Resemble Enhance for advanced audio manipulation.
  • Allows customization of XTTS generation parameters and speaker samples.
  • Offers fine-tuning capabilities for creating custom voice models.

Maintenance & Community

The project is described as actively maintained, although the health check reports the last commit as 8 months ago and rates responsiveness as inactive. Community links and roadmap details are not provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README notes that the "Train" tab is broken and directs users to the separate xtts-finetune-webui project for training. The portable version is Windows-only.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
