Open-source tool for translating/dubbing YouTube videos into Chinese
Top 72.7% on sourcepulse
YouDub is an open-source tool designed to automate the translation and dubbing of YouTube videos into Chinese, preserving the original speaker's voice. It targets content creators and consumers looking to localize high-quality video content for the Chinese internet. The primary benefit is the creation of Chinese-dubbed videos with the original YouTuber's vocal characteristics.
How It Works
YouDub leverages a pipeline of AI technologies. It uses OpenAI's Whisper for accurate speech-to-text conversion, followed by large language models (like GPT-3.5-turbo or GPT-4) for translating the transcribed text into Chinese. Finally, it employs AI voice cloning, currently using Paddle Speech, to generate Chinese audio that mimics the original speaker's tone and timbre. The system integrates these steps to ensure synchronized audio and video output.
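The three stages above can be sketched as a small function that takes each stage as a callable. This is only an illustration of the data flow, not YouDub's actual code: the names `Segment`, `dub_pipeline`, `transcribe`, `translate`, and `synthesize` are hypothetical, and the stand-in lambdas replace Whisper, the GPT model, and Paddle Speech so the sketch runs without models or API keys. Carrying each segment's start and end times through the stages is one plausible way to keep the dubbed audio aligned with the video.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One transcribed utterance (hypothetical structure, not YouDub's)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # recognized source-language text


def dub_pipeline(
    audio_path: str,
    transcribe: Callable[[str], List[Segment]],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> List[dict]:
    """Run the three stages in order, keeping each segment's timing."""
    dubbed = []
    for seg in transcribe(audio_path):   # 1. speech-to-text (e.g. Whisper)
        zh_text = translate(seg.text)    # 2. LLM translation into Chinese
        zh_audio = synthesize(zh_text)   # 3. voice-cloned text-to-speech
        dubbed.append(
            {"start": seg.start, "end": seg.end, "text": zh_text, "audio": zh_audio}
        )
    return dubbed


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any model or API key.
    fake_transcribe = lambda path: [Segment(0.0, 1.5, "Hello"), Segment(1.5, 3.0, "World")]
    fake_translate = lambda text: {"Hello": "你好", "World": "世界"}[text]
    fake_synthesize = lambda text: text.encode("utf-8")

    for item in dub_pipeline("talk.wav", fake_transcribe, fake_translate, fake_synthesize):
        print(item["start"], item["end"], item["text"])
```

Passing the stages in as parameters keeps the sketch independent of any particular speech, translation, or TTS backend, which mirrors the summary's note that the TTS engine may change.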
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt (ensure PyTorch is installed with the appropriate CUDA version if needed, e.g., pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118).
Configuration: create a .env file with OPENAI_API_KEY, MODEL_NAME, HF_TOKEN (for speaker diarization), and optionally OPENAI_API_BASE or APPID/ACCESS_TOKEN for alternative TTS.
Run: python main.py --input_folders /path/to/input --output_folders /path/to/output [--diarize]
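Before a long run, it can help to check that the .env file contains the keys listed above. The following is a minimal sketch, not part of YouDub: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only simple KEY=VALUE lines, and treating HF_TOKEN as required only when diarization is requested is an assumption based on the "(for speaker diarization)" note.

```python
from typing import Dict, List

# Keys the summary lists without qualification.
ALWAYS_REQUIRED = ("OPENAI_API_KEY", "MODEL_NAME")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: Dict[str, str], diarize: bool = False) -> List[str]:
    """Return required keys that are absent or empty."""
    required = list(ALWAYS_REQUIRED)
    if diarize:
        # Assumption: HF_TOKEN is only needed when --diarize is used.
        required.append("HF_TOKEN")
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = 'OPENAI_API_KEY="sk-placeholder"\nMODEL_NAME=gpt-3.5-turbo\n'
    print(missing_keys(parse_env(sample), diarize=True))
```

In practice the file contents would come from reading `.env` from the project directory; the optional keys (OPENAI_API_BASE, APPID, ACCESS_TOKEN) are deliberately not checked here.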
Highlighted Details
The --diarize flag enables speaker diarization using pyannote.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The current AI voice cloning (Paddle Speech) cannot simultaneously generate Chinese and English within the same sentence. Some TTS options (Volcano Engine) may incur costs. Using gpt-4 for translation can be expensive.
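Given the mixed-language limitation, one possible safeguard is to flag translated sentences that contain both Chinese characters and Latin letters before they reach the TTS stage. The sketch below is not something YouDub is documented to do: `mixes_chinese_and_english` is a hypothetical helper, and it checks only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) and ASCII letters.

```python
import re

# Basic CJK Unified Ideographs block; extension blocks are not covered.
CJK = re.compile(r"[\u4e00-\u9fff]")
# ASCII letters only; digits and punctuation are ignored.
LATIN = re.compile(r"[A-Za-z]")


def mixes_chinese_and_english(sentence: str) -> bool:
    """True when the sentence contains both Chinese characters and Latin letters."""
    return bool(CJK.search(sentence)) and bool(LATIN.search(sentence))


if __name__ == "__main__":
    for s in ["我喜欢 Python", "我喜欢编程", "Hello world"]:
        print(s, mixes_chinese_and_english(s))
```

Flagged sentences could then be retranslated or have the English terms transliterated, depending on how the translation prompt is set up.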