GPT-SoVITS-Server by ben0oil1

Inference server for voice cloning models

Created 1 year ago

314 stars

Top 86.1% on SourcePulse

Project Summary

This project provides a simplified inference server for GPT-SoVITS, a leading voice cloning technology. It targets users who have already trained models and need an easy-to-deploy way to synthesize speech, especially in resource-constrained environments such as mobile phones or CPU-only servers. It hides the complexity of the full GPT-SoVITS project.

How It Works

The server extracts the core inference logic from the original GPT-SoVITS project into a single server.py file. This approach prioritizes minimal dependencies and ease of use, allowing users to run voice cloning with pre-trained models without needing to manage the entire, complex original project. It's designed to be runnable on CPUs, making it accessible for users without expensive GPU hardware.

Quick Start & Requirements

Install: Download the pre-trained models (chinese-hubert-base, chinese-roberta-wwm-ext-large) from Hugging Face, place them locally, and update the corresponding paths in server.py. On Windows, launch the server with the bundled runtime (runtime/python.exe ./server.py) and make sure ffmpeg.exe sits in the same directory as server.py.
Prerequisites: Python, pre-trained models, ffmpeg.exe (Windows only).
Setup: Minimal, focused on downloading models and configuring paths.
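
The setup steps above can be checked before launching the server. The following is a minimal sketch, not part of the project: the function name find_problems, the idea of keeping both model folders under a single pretrained_models directory, and the messages are all assumptions made for illustration. It only verifies that the two model folders exist and are non-empty and, on Windows, that ffmpeg.exe sits next to server.py.

```python
import sys
from pathlib import Path
from typing import List

# Model folder names taken from the Quick Start text above.
MODEL_NAMES = ("chinese-hubert-base", "chinese-roberta-wwm-ext-large")


def find_problems(model_root: Path, server_dir: Path,
                  platform: str = sys.platform) -> List[str]:
    """Return human-readable setup problems; an empty list means ready.

    model_root and server_dir are hypothetical locations: adjust them
    to wherever the models and server.py actually live.
    """
    problems = []
    for name in MODEL_NAMES:
        model_dir = model_root / name
        # A downloaded model folder should exist and contain files.
        if not model_dir.is_dir() or not any(model_dir.iterdir()):
            problems.append(f"missing or empty model folder: {model_dir}")
    # ffmpeg.exe next to server.py is only required on Windows.
    if platform.startswith("win"):
        if not (server_dir / "ffmpeg.exe").is_file():
            problems.append(f"ffmpeg.exe not found in {server_dir}")
    return problems


if __name__ == "__main__":
    # Assumed layout: ./pretrained_models/<model name>, server.py in "."
    issues = find_problems(Path("pretrained_models"), Path("."))
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("setup looks complete")
```

Running the script from the directory that holds server.py lists anything still missing. It does not confirm that the paths inside server.py point at the same folders, so those still need to be edited by hand.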

Highlighted Details

Designed for CPU inference, making voice cloning accessible without GPUs.
Successfully tested on a mobile phone, demonstrating extreme portability.
Focuses solely on Chinese language support, simplifying the codebase.
Aims to abstract away complex environment setup for end-users.

Maintenance & Community

The project is a personal extraction from the original GPT-SoVITS. Future optimization plans include re-integrating Japanese and English support, code standardization, performance improvements, and potentially a GUI wrapper and Docker packaging.

Licensing & Compatibility

The README does not state a license, so the terms for commercial use or closed-source linking are unspecified.

Limitations & Caveats

The project is currently Chinese-only; Japanese and English support has been removed. The README also notes that the paths handled by the clean_path function in server.py may need adjusting, so some manual configuration is required.

Explore Similar Projects

MahaTTS by dubverse-ai

gptsovits-api by jianchang512

xtts2-ui by BoltzmannEntropy

vits-simple-api by Artrajz

xtts-webui by daswer123

WhisperSpeech by WhisperSpeech

metavoice-src by metavoiceio

VITS-fast-fine-tuning by Plachtaa

Spark-TTS by SparkAudio

fish-speech by fishaudio

CosyVoice by FunAudioLLM

GPT-SoVITS by RVC-Boss