SonicVale by xcLee001

AI voice generation platform for diverse content

Created 9 months ago

579 stars

Top 55.2% on SourcePulse

Project Summary

A multi-character, multi-emotion AI voice generation platform, SonicVale addresses the need for automated, expressive audio creation for content like novels, scripts, and videos. It targets content creators seeking to streamline multimedia production workflows by providing a tool for generating diverse and nuanced voiceovers.

How It Works

SonicVale utilizes an Electron, Vue, and Element Plus frontend paired with a FastAPI Python backend. It integrates with IndexTTS-2.0 based Text-to-Speech (TTS) services, supporting both cloud-native builds (with H20 GPU) and local deployments. A key feature is its compatibility with LLMs via the OpenAI API protocol, allowing for flexible text processing and voice generation. The platform's core approach involves automated dialogue splitting, character and emotion binding, and precise audio editing capabilities.

Quick Start & Requirements

Installation begins with cloning the repository. Users must download ffmpeg.exe and place it in the app/core/ffmpeg/ directory. The backend is set up by installing Python dependencies (pip install -r requirements.txt) and starting the server (uvicorn app.main:app --reload --port 8200). The frontend requires Node.js for dependency installation (npm install) and startup (npm run start). Essential prerequisites include Python, Node.js, and FFmpeg. A demonstration video is available on Bilibili. While detailed documentation is mentioned, a direct URL is not provided.

Highlighted Details

Supports multi-character and multi-emotion voice synthesis.
Automated dialogue splitting from imported text sources like novels and scripts.
Character library management with customizable emotion and voice binding.
Integration with customizable LLM interfaces and TTS configurations.
Advanced audio editing features for fine-tuning segments and adding silence.

Maintenance & Community

Bug reports and feature suggestions should be submitted via GitHub Issues. Community interaction is facilitated through QQ Group 1060711739 (verification: "音谷配音"). Specific details regarding core maintainers, sponsorships, or a public roadmap are not provided in the README.

Licensing & Compatibility

The project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This license mandates that any modifications or derivative works, particularly those offered as a network service (SaaS), must be made publicly available under the same AGPL-3.0 terms.

Limitations & Caveats

SonicVale is explicitly intended for learning and research purposes only. Users are strictly prohibited from engaging in illegal activities, including unauthorized voice cloning or infringing upon intellectual property rights. All associated risks and responsibilities are assumed by the user.

SonicVale by xcLee001

Explore Similar Projects

izwi by izwi-ai

ComfyUI-F5-TTS by niknah

qwen3-TTS-studio by bc-dunia

ComfyUI_IndexTTS by billwuhao

ultimate-rvc by JackismyShephard

alexandria-audiobook by Finrandojin

MOSS-TTSD by OpenMOSS

resonance by code-with-antonio

Chatterbox-TTS-Server by devnen

easyVoice by cosin2077

elevenlabs-python by elevenlabs

Qwen3-TTS by QwenLM