VideoLingo by Huanshere

AI tool for automated video translation, localization, and dubbing

created 11 months ago
14,072 stars

Top 3.6% on sourcepulse

Project Summary

VideoLingo is an all-in-one AI-powered tool for video localization, offering automated subtitle generation, translation, alignment, and dubbing to create Netflix-quality content. It targets content creators and distributors aiming to overcome language barriers and expand their reach globally.

How It Works

VideoLingo uses WhisperX for word-level speech recognition and timestamp alignment, then applies NLP-based segmentation to split the transcript into subtitle units. Translation draws on a custom or AI-generated terminology list to keep terms accurate and consistent. A "Translate-Reflect-Adaptation" process then refines the output toward cinematic quality, and the tool produces single-line subtitles only. Dubbing is handled by a choice of TTS engines, including GPT-SoVITS, Azure, and OpenAI.
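
The "Translate-Reflect-Adaptation" idea can be illustrated with a minimal sketch. This is not VideoLingo's actual code: the function name, the prompts, and the `llm` callable (any function that takes a prompt string and returns the model's reply) are all assumptions made for illustration, and the three-pass split is inferred from the process name only.

```python
from typing import Callable, Optional

# Hypothetical type: any function mapping a prompt to the model's text reply.
LLM = Callable[[str], str]


def translate_reflect_adapt(
    line: str,
    target_lang: str,
    llm: LLM,
    glossary: Optional[dict] = None,
) -> dict:
    """Translate one subtitle line in three passes (illustrative prompts only)."""
    # Turn the optional terminology list into a short instruction.
    terms = "; ".join(f"{src} => {dst}" for src, dst in (glossary or {}).items())
    term_note = f" Use this terminology: {terms}." if terms else ""

    # Pass 1 (translate): a faithful first draft.
    draft = llm(f"Translate into {target_lang}.{term_note}\nText: {line}").strip()

    # Pass 2 (reflect): ask the model to critique its own draft.
    critique = llm(
        f"List problems with this {target_lang} translation.\n"
        f"Source: {line}\nDraft: {draft}"
    ).strip()

    # Pass 3 (adapt): rewrite the draft using the critique.
    final = llm(
        "Rewrite the draft as one natural subtitle line, fixing these problems.\n"
        f"Source: {line}\nDraft: {draft}\nProblems: {critique}"
    )

    # Collapse whitespace so the result stays on a single line.
    return {"draft": draft, "critique": critique, "final": " ".join(final.split())}
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider and makes it easy to test with a stub.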

Quick Start & Requirements

  • Install: Clone the repository, activate a Python 3.10 conda environment, and run python install.py.
  • Prerequisites: Python 3.10 and FFmpeg (install via your package manager). Windows users with NVIDIA GPUs also need CUDA Toolkit 12.6 and cuDNN 9.3.0. Docker requires CUDA 12.4 and NVIDIA Driver >550.
  • Run: streamlit run st.py
  • Docs: English, 中文
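
Before running the installer, the two basic prerequisites listed above can be checked with a short script. This is a hedged sketch rather than part of the project: `check_environment` is a hypothetical helper, and it only verifies the Python version and that an `ffmpeg` executable is on the PATH, not the CUDA or cuDNN setup.

```python
import shutil
import sys


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means the basics look fine.

    Both arguments are injectable so the check can be tested without
    changing the real interpreter or PATH.
    """
    problems = []
    # The summary above asks for a Python 3.10 environment.
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when the executable is not on the PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```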

Highlighted Details

  • Automated subtitle cutting, translation, alignment, and dubbing.
  • Generates Netflix-standard, single-line subtitles.
  • Supports multiple TTS engines including GPT-SoVITS for voice cloning.
  • One-click startup via Streamlit UI.
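
To make the single-line subtitle item concrete, here is a simplified sketch of grouping word-level timestamps into one-line cues. It is a rule-based stand-in, not the NLP segmentation VideoLingo uses: the `Word` and `Cue` classes, the `words_to_cues` function, and the 42-character default are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


def _flush(words):
    """Merge a run of words into one cue spanning first start to last end."""
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def words_to_cues(words, max_chars=42):
    """Greedily pack words into single-line cues.

    A cue is closed when adding the next word would exceed max_chars,
    or right after a word that ends a sentence.
    """
    cues, current = [], []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > max_chars:
            cues.append(_flush(current))
            current = []
        current.append(word)
        if word.text.endswith((".", "?", "!")):
            cues.append(_flush(current))
            current = []
    if current:
        cues.append(_flush(current))
    return cues
```

Because each cue takes its start time from its first word and its end time from its last, the cue timing stays tied to the word-level alignment rather than being re-estimated.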

Maintenance & Community

  • Project is actively maintained by Huanshere.
  • Contact via GitHub Issues/PRs, Twitter (@Huanshere), or email (team@videolingo.io).

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Compatible with commercial use.

Limitations & Caveats

WhisperX transcription can be degraded by background noise or music, and subtitles ending in numbers may be truncated. When a video contains multiple languages, only the primary language is retained in the transcript. The tool cannot dub multiple characters separately because speaker distinction in WhisperX is unreliable. Dubbing quality may vary because speech rate and intonation differ between languages.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 4
  • Issues (30d): 14

Star History

  • 1,985 stars in the last 90 days
