API endpoint for GPT-SoVITS voice cloning
This project provides an enhanced API for GPT-SoVITS, a zero/few-shot Chinese voice cloning model. It addresses limitations in the original API, such as poor handling of mixed-language input and sentence splitting, offering a more robust interface for developers and researchers integrating voice cloning into applications.
How It Works
The API is built upon the GPT-SoVITS framework, offering improved text processing capabilities. It supports splitting text by punctuation for more natural speech synthesis and allows for mixed-language inputs within a single request. The API can be configured to use default reference audio or accept specific reference audio paths and text during API calls.
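The punctuation-based splitting described above can be sketched roughly as follows. This is a minimal illustration and not the code in api2.py: the function name `split_by_punctuation` and the chosen set of sentence-ending marks are assumptions made for the example.

```python
import re

# Hypothetical set of sentence-ending marks (Chinese and Western).
# The marks actually used by api2.py are not stated in the source.
_SENTENCE = re.compile(r"[^。！？!?；;]+[。！？!?；;]*")


def split_by_punctuation(text):
    """Split text into sentence-like chunks, keeping the trailing punctuation."""
    chunks = (match.group().strip() for match in _SENTENCE.finditer(text))
    # Drop chunks that were only whitespace.
    return [chunk for chunk in chunks if chunk]


if __name__ == "__main__":
    for chunk in split_by_punctuation("你好。Hello world! 再见"):
        print(chunk)
```

Each chunk could then be synthesized separately and the audio concatenated, which is one common way to get more natural pauses out of long or mixed-language input.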
Quick Start & Requirements
Install: place api2.py in the GPT-SoVITS directory.
Run: .\runtime\python api2.py on Windows (or python api2.py on Linux).
Highlighted Details
Maintenance & Community
This is a community-driven enhancement to the GPT-SoVITS project. Further community support and development can be found via the original GPT-SoVITS repository.
Licensing & Compatibility
The README does not state a license for this API enhancement. It presumably falls under the license of the underlying GPT-SoVITS project, which is released under the MIT License, a permissive license that allows commercial use and integration with closed-source applications.
Limitations & Caveats
The API does not support dynamic model switching; separate API servers must be launched for different models on different ports. The README does not specify version compatibility with the core GPT-SoVITS project.
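Because each server instance serves a single model, a client has to pick the right port itself. The sketch below shows one way to do that. It is an illustration under stated assumptions: the model names, the port numbers, the root path, and the `text` query parameter are all hypothetical and are not taken from the api2.py README.

```python
from urllib.parse import urlencode

# Hypothetical mapping: one separately launched API server per model.
MODEL_PORTS = {
    "speaker_a": 9880,
    "speaker_b": 9881,
}


def build_url(model, text, host="127.0.0.1"):
    """Return the request URL for the server that hosts the given model."""
    if model not in MODEL_PORTS:
        raise KeyError(f"no API server configured for model {model!r}")
    # 'text' is an assumed parameter name, used only for illustration.
    query = urlencode({"text": text})
    return f"http://{host}:{MODEL_PORTS[model]}/?{query}"


if __name__ == "__main__":
    print(build_url("speaker_b", "hi"))
```

Keeping the mapping in one place means adding a model only requires launching another server and adding one entry.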