VideoLingo by Huanshere

AI tool for automated video translation, localization, and dubbing

Created 1 year ago

16,027 stars

Top 3.1% on SourcePulse

1 Expert Loves This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Project Summary

VideoLingo is an all-in-one AI-powered tool for video localization, offering automated subtitle generation, translation, alignment, and dubbing to create Netflix-quality content. It targets content creators and distributors aiming to overcome language barriers and expand their reach globally.

How It Works

VideoLingo uses WhisperX for word-level transcription and timestamp alignment, then applies NLP to split the text into subtitle-sized segments. Translation draws on custom or AI-generated terminology for accuracy. A "Translate-Reflect-Adaptation" process then refines the output toward cinematic quality, and the tool produces single-line subtitles only. Dubbing is available through several TTS engines, including GPT-SoVITS, Azure, and OpenAI.
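The three-pass refinement described above can be pictured roughly as follows. This is a minimal sketch, not VideoLingo's actual code: the function name translate_reflect_adapt, the prompt wording, and the llm callable (any function that maps a prompt string to a reply string) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def translate_reflect_adapt(
    line: str,
    target_lang: str,
    llm: LLM,
    glossary: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Run one subtitle line through three LLM passes (illustrative only)."""
    # Render the terminology list so the model sees the preferred terms.
    terms = "; ".join(f"{src} => {dst}" for src, dst in (glossary or {}).items())

    # Pass 1 (translate): a faithful first draft.
    draft = llm(
        f"Translate into {target_lang}. Terminology: {terms or 'none'}.\n"
        f"Text: {line}"
    )
    # Pass 2 (reflect): ask for a critique of the draft against the source.
    critique = llm(
        "List accuracy and fluency problems in this translation.\n"
        f"Source: {line}\nDraft: {draft}"
    )
    # Pass 3 (adapt): rewrite the draft using the critique.
    final = llm(
        f"Rewrite the draft as one natural subtitle line in {target_lang}, "
        f"addressing the critique.\nDraft: {draft}\nCritique: {critique}"
    )
    return {"draft": draft, "critique": critique, "final": final}
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider and makes it easy to exercise with a stub.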

Quick Start & Requirements

Install: Clone the repository, activate a Python 3.10 conda environment, and run python install.py.
Prerequisites: Python 3.10 and FFmpeg (install via your package manager). Windows users with NVIDIA GPUs also need CUDA Toolkit 12.6 and cuDNN 9.3.0. The Docker setup requires CUDA 12.4 and an NVIDIA driver newer than version 550.
Run: streamlit run st.py
Docs: English, 中文
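
Before running the installer, it can help to confirm the two prerequisites that apply on every platform. The following is a small sketch of such a check, not part of VideoLingo; the preflight function and its parameters are hypothetical, and it only looks at the Python version and whether an ffmpeg executable is on PATH (it does not inspect CUDA or cuDNN).

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple


def preflight(
    version: Tuple[int, int] = (sys.version_info.major, sys.version_info.minor),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    # The summary above names Python 3.10 specifically.
    if version != (3, 10):
        problems.append(f"Python 3.10 expected, found {version[0]}.{version[1]}")
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

The version and lookup function are parameters only so the check can be exercised without changing the real environment.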

Highlighted Details

Automated subtitle cutting, translation, alignment, and dubbing.
Generates Netflix-standard, single-line subtitles.
Supports multiple TTS engines including GPT-SoVITS for voice cloning.
One-click startup via Streamlit UI.
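
To make the single-line constraint concrete, here is a naive sketch that packs word-level timestamps into one-line cues. It is a greedy stand-in, not VideoLingo's NLP-based segmentation; the Word and Cue types and the to_single_line_cues function are hypothetical, and the 42-character default is an assumed limit in the spirit of common subtitle style guides rather than a value taken from the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


MAX_CHARS = 42  # assumed per-line limit, not taken from VideoLingo


def _flush(words: List[Word]) -> Cue:
    # A cue spans from its first word's start to its last word's end.
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def to_single_line_cues(words: List[Word], max_chars: int = MAX_CHARS) -> List[Cue]:
    """Greedily pack words into cues that each fit on a single line."""
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        # Start a new cue when adding this word would overflow the line.
        if current and len(candidate) > max_chars:
            cues.append(_flush(current))
            current = []
        current.append(word)
    if current:
        cues.append(_flush(current))
    return cues
```

A greedy packer like this ignores sentence boundaries, which is exactly the gap that meaning-aware segmentation is meant to close.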

Maintenance & Community

Project is actively maintained by Huanshere.
Contact via GitHub Issues/PRs, Twitter (@Huanshere), or email (team@videolingo.io).

Licensing & Compatibility

Licensed under Apache 2.0.
Compatible with commercial use.

Limitations & Caveats

WhisperX transcription can be affected by background noise or music, and subtitles ending in numbers may be truncated. Multilingual video transcription only retains the primary language. The tool cannot dub multiple characters separately due to unreliable speaker distinction in WhisperX. Dubbing quality may vary due to speech rate and intonation differences.

Health Check

Last Commit: 9 months ago
Responsiveness: 1 day
Star History: 287 stars in the last 30 days