ben0oil1
Inference server for voice cloning models
This project provides a simplified inference server for GPT-SoVITS, a voice cloning system. It targets users who have already trained models and need an easy-to-deploy solution for voice synthesis, especially in resource-constrained environments such as mobile phones or CPU-only servers. It hides the complexity of the full GPT-SoVITS project.
How It Works
The server extracts the core inference logic from the original GPT-SoVITS project into a single server.py file. This approach prioritizes minimal dependencies and ease of use, allowing users to run voice cloning with pre-trained models without needing to manage the entire, complex original project. It's designed to be runnable on CPUs, making it accessible for users without expensive GPU hardware.
Quick Start & Requirements
[...] chinese-hubert-base, chinese-roberta-wwm-ext-large) from Hugging Face and place them locally, updating the paths in server.py. On Windows, use the provided runtime (runtime/python.exe ./server.py) and ensure ffmpeg.exe is in the same directory as server.py.
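The setup requirements above can be checked before launching the server. The following is a minimal sketch, not part of the project: the function name missing_prerequisites, the idea that each model lives in a directory named after it, and the ./pretrained_models folder in the usage line are all assumptions made for illustration.

```python
import platform
from pathlib import Path

# Assumption: each model is stored in a directory named after the model.
REQUIRED_MODEL_DIRS = ("chinese-hubert-base", "chinese-roberta-wwm-ext-large")


def missing_prerequisites(project_dir, models_dir, system=None):
    """Return a list of problems; an empty list means nothing is missing."""
    project_dir = Path(project_dir)
    models_dir = Path(models_dir)
    system = system or platform.system()
    problems = []

    if not (project_dir / "server.py").is_file():
        problems.append(f"server.py not found in {project_dir}")

    for name in REQUIRED_MODEL_DIRS:
        if not (models_dir / name).is_dir():
            problems.append(f"model directory missing: {models_dir / name}")

    # ffmpeg.exe next to server.py is only needed on Windows.
    if system == "Windows" and not (project_dir / "ffmpeg.exe").is_file():
        problems.append(f"ffmpeg.exe not found in {project_dir}")

    return problems


if __name__ == "__main__":
    # Hypothetical layout: models kept under ./pretrained_models.
    for problem in missing_prerequisites(".", "./pretrained_models"):
        print("MISSING:", problem)
```

The check only confirms that files and directories exist. The model paths inside server.py still have to be edited by hand to point at the same locations.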
Maintenance & Community
The project is a personal extraction from the original GPT-SoVITS. Future optimization plans include re-integrating Japanese and English support, code standardization, performance improvements, and potentially a GUI wrapper and Docker packaging.
Licensing & Compatibility
The README does not state a license, so the terms for commercial use or closed-source linking are unspecified.
Limitations & Caveats
The project is currently Chinese-only; Japanese and English support has been removed. The README notes that paths handled by the clean_path function in server.py may need adjusting, so some manual configuration should be expected.
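For readers unsure what path adjustment typically involves, here is a generic sketch of path normalization. It is not the project's clean_path, whose actual behavior is not described here; the helper name normalize_user_path and its windows flag are hypothetical.

```python
import ntpath
import posixpath


def normalize_user_path(raw, windows):
    """Strip stray whitespace and quotes, then normalize separators.

    Hypothetical helper for illustration; not the project's clean_path.
    """
    cleaned = raw.strip().strip("\"'").strip()
    if windows:
        return ntpath.normpath(cleaned.replace("/", "\\"))
    return posixpath.normpath(cleaned.replace("\\", "/"))


if __name__ == "__main__":
    print(normalize_user_path(' "C:/models/gpt.ckpt" ', windows=True))
    print(normalize_user_path("'models\\sovits.pth'\n", windows=False))
```

Paths pasted from a file manager often carry surrounding quotes or the wrong separator for the host OS, which is the kind of issue a cleaning step like this is meant to catch.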
1 year ago
Inactive