Soul-AILab
Zero-shot singing voice synthesis for high-fidelity audio
Summary
SoulX-Singer offers high-quality, zero-shot singing voice synthesis, enabling realistic voice generation for unseen singers without fine-tuning. It targets researchers and developers, providing flexible melody/rhythm control, timbre cloning across languages, and singing voice editing.
How It Works
This project utilizes a zero-shot singing voice synthesis model trained on over 42,000 hours of aligned vocal data. It supports melody-conditioned (F0) and score-conditioned (MIDI) inputs for precise control. Key advantages include timbre cloning across languages, cross-lingual synthesis by disentangling timbre from content, and singing voice editing while preserving natural prosody.
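To make the difference between the two conditioning modes concrete, here is a minimal sketch of how a note-level score (MIDI pitches with durations) relates to a frame-level F0 contour. It uses the standard equal-temperament mapping (A4 = MIDI note 69 = 440 Hz). The function names, the `(pitch, duration)` note format, and the 100 frames-per-second rate are assumptions made for illustration; they do not reflect SoulX-Singer's actual input format or code.

```python
import math

A4_MIDI = 69      # MIDI note number of A4
A4_HZ = 440.0     # standard concert pitch for A4


def midi_to_hz(note: float) -> float:
    """Convert a MIDI note number to frequency in Hz (equal temperament)."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq: float) -> float:
    """Convert a frequency in Hz to a (possibly fractional) MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq / A4_HZ)


def score_to_f0(notes, frame_rate=100):
    """Expand a note-level score into a flat per-frame F0 contour.

    notes: list of (midi_pitch or None for a rest, duration in seconds).
    frame_rate: frames per second (hypothetical value, chosen for illustration).
    Returns a list of F0 values in Hz, with 0.0 marking unvoiced/rest frames.
    """
    f0 = []
    for pitch, duration in notes:
        n_frames = round(duration * frame_rate)
        value = 0.0 if pitch is None else midi_to_hz(pitch)
        f0.extend([value] * n_frames)
    return f0


if __name__ == "__main__":
    # A4 for 20 ms, then a 10 ms rest, at 100 frames per second.
    print(score_to_f0([(69, 0.02), (None, 0.01)]))
```

A score expanded this way is piecewise constant, whereas an F0 contour extracted from a real performance also carries vibrato, glides, and pitch drift. That is presumably why the two input types are offered separately: the score gives editable, note-level control, while the F0 curve preserves the expressive detail of a reference melody.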
Quick Start & Requirements
Install: create a conda environment (conda create -n soulxsinger -y python=3.10), activate it, and install dependencies (pip install -r requirements.txt).
Models: download the pretrained models (hf download Soul-AILab/SoulX-Singer --local-dir pretrained_models/SoulX-Singer and hf download Soul-AILab/SoulX-Singer-Preprocess --local-dir pretrained_models/SoulX-Singer-Preprocess).
Run: run the inference demo with bash example/infer.sh, or launch the interactive WebUI with python webui.py.
Highlighted Details
Maintenance & Community
Recent releases (February 2026) include an evaluation dataset, an online demo, a MIDI editor, and the inference code and models. The roadmap lists the WebUI and online demos as completed, with MIDI-based input support and comprehensive tutorials planned. The maintainers can be reached by email (qianjiale@soulapp.cn, menghao@soulapp.cn, wangxinsheng@soulapp.cn), and discussion groups are available on WeChat and the Soul app.
Licensing & Compatibility
Licensed under the Apache 2.0 license, permitting free research and developer use. A usage disclaimer emphasizes responsible use, respecting intellectual property and consent, and prohibits impersonation or deceptive audio creation. Developers disclaim liability for misuse.
Limitations & Caveats
The automatic preprocessing pipeline may yield imperfect alignment between singing audio and corresponding lyrics/musical notes, necessitating manual correction via the provided MIDI-Editor for optimal synthesis quality.