AI dubbing/translation tool for multi-language video content creation
Top 18.5% on sourcepulse
Linly-Dubbing is an AI-powered tool for multi-language video dubbing and translation, aimed at content creators and businesses seeking global reach. It automates video localization by chaining specialized AI models for vocal separation, transcription, translation, and speech synthesis, improving the naturalness and accuracy of the result.
How It Works
Linly-Dubbing orchestrates a pipeline of specialized AI models. It begins with optional video download via yt-dlp, followed by vocal separation using Demucs or UVR5. Speech is transcribed with WhisperX or FunASR, then translated via an LLM such as an OpenAI model or Qwen. Finally, speech is synthesized with XTTS, CosyVoice, or GPT-SoVITS, with an optional digital-human lip-sync layer inspired by Linly-Talker for visual synchronization.
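The stages above can be sketched as a simple function chain. This is an illustrative outline, not the project's actual API: every function name here is hypothetical, and each model call is stubbed out.

```python
# Sketch of the Linly-Dubbing pipeline stages. All function names are
# illustrative; each stage stands in for the model named in its docstring.

def download_video(url: str) -> str:
    """Fetch the source video (the project uses yt-dlp); returns a file path."""
    return "video.mp4"  # stub

def separate_vocals(video_path: str) -> tuple[str, str]:
    """Split vocals from background audio (Demucs or UVR5)."""
    return "vocals.wav", "background.wav"  # stub

def transcribe(vocals_path: str) -> list[dict]:
    """Speech-to-text with timestamps (WhisperX or FunASR)."""
    return [{"start": 0.0, "end": 1.5, "text": "hello world"}]  # stub

def translate(segments: list[dict], target_lang: str) -> list[dict]:
    """Translate each segment's text via an LLM (OpenAI or Qwen)."""
    return [{**seg, "text": f"[{target_lang}] {seg['text']}"} for seg in segments]

def synthesize(segments: list[dict]) -> str:
    """Re-voice the translated segments (XTTS, CosyVoice, or GPT-SoVITS)."""
    return "dubbed.wav"  # stub

def dub(url: str, target_lang: str) -> str:
    """Run the full pipeline: download, separate, transcribe, translate, synthesize."""
    video = download_video(url)
    vocals, background = separate_vocals(video)
    segments = transcribe(vocals)
    translated = translate(segments, target_lang)
    return synthesize(translated)

print(dub("https://example.com/video", "zh"))  # -> dubbed.wav
```

Keeping the stages this loosely coupled is what lets the project swap implementations per stage (e.g. Demucs vs. UVR5, or XTTS vs. CosyVoice) without changing the overall flow.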
Quick Start & Requirements
1. Install Python dependencies from requirements.txt and requirements_module.txt, plus ffmpeg, pynini, and yt-dlp.
2. Optional: request access to pyannote/speaker-diarization-3.1 for speaker diarization.
3. Create a .env file with API keys (OpenAI, Hugging Face, Baidu) and model names.
4. Download models with scripts/download_models.sh (Linux) or scripts/modelscope_download.py (Windows).
5. Launch the web UI with python webui.py.
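A minimal sketch of the setup as shell commands. The conda environment name, Python version, and package channels here are assumptions, not the project's documented procedure; consult the repository for the exact steps.

```shell
# Assumed: a fresh conda environment (name and Python version are illustrative)
conda create -n linly_dubbing python=3.10 -y
conda activate linly_dubbing

# Python dependencies from the two requirements files
pip install -r requirements.txt -r requirements_module.txt

# ffmpeg and pynini are often easiest to install from conda-forge
conda install -c conda-forge ffmpeg pynini -y
pip install yt-dlp

# Download model weights (Linux script; Windows uses scripts/modelscope_download.py)
bash scripts/download_models.sh

# Launch the web UI
python webui.py
```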
Highlighted Details
Maintenance & Community
The project is hosted on GitHub by Kedreamix. Links to related projects like Linly-Talker are provided.
Licensing & Compatibility
Licensed under the Apache License 2.0. Users are cautioned to comply with copyright, data protection, and privacy laws, and to obtain necessary permissions before use.
Limitations & Caveats
The installation process is described as very slow. Some advanced features like speaker diarization require explicit access requests. The README notes that large model performance can be limited, recommending more powerful APIs or models.