AI dubbing system for videos
Top 90.6% on SourcePulse
Open Dubbing is an experimental command-line AI system for automatically translating and synchronizing video dialogue into different languages. It's designed for users interested in understanding and experimenting with the integration of Speech-to-Text (STT), Text-to-Speech (TTS), and machine translation technologies for video localization.
How It Works
This system orchestrates a pipeline of open-source models for STT (Whisper), translation (NLLB-200, Apertium API), and TTS (Coqui, MMS, Edge, OpenAI). It supports automatic source language detection and offers configurable voice gender assignment for synthetic voices. The approach leverages established models to provide a flexible, locally runnable dubbing solution.
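To make the stages concrete, here is a minimal sketch of such a pipeline in Python, assuming the openai-whisper, transformers, and Coqui TTS packages. The model ids, language codes, and file names are illustrative assumptions, not the project's actual code.

# Illustrative STT -> translation -> TTS pipeline (not open-dubbing's code).
import whisper
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from TTS.api import TTS

# 1. Speech-to-text: Whisper transcribes and auto-detects the source language.
stt = whisper.load_model("small")
result = stt.transcribe("dialogue.wav")
source_text = result["text"]  # result["language"] holds the detected language

# 2. Translation: NLLB-200 into the target language; "cat_Latn" is Catalan,
#    matching the --target_language=cat example below. A real system would
#    map Whisper's detected language code to the matching NLLB source code.
mt_id = "facebook/nllb-200-distilled-600M"
tok = AutoTokenizer.from_pretrained(mt_id)
mt = AutoModelForSeq2SeqLM.from_pretrained(mt_id)
batch = tok(source_text, return_tensors="pt")
out = mt.generate(**batch, forced_bos_token_id=tok.convert_tokens_to_ids("cat_Latn"))
translated = tok.batch_decode(out, skip_special_tokens=True)[0]

# 3. Text-to-speech: synthesize the dubbed line with Coqui TTS.
#    The model id below is a placeholder; choose one for the target language.
tts = TTS(model_name="tts_models/cat/custom/vits")
tts.tts_to_file(text=translated, file_path="dubbed_line.wav")

The real tool adds the pieces this sketch omits, such as speaker handling (hence the pyannote.audio requirement), per-segment timing so the dub stays synchronized, and reassembling the video with ffmpeg.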
Quick Start & Requirements
Install from PyPI:

pip install open_dubbing

Add the [coqui] or [openai] extras for the corresponding TTS engine support. ffmpeg must be installed system-wide (Linux, macOS, Windows), and espeak-ng is required for Coqui-TTS on Linux/macOS. A Hugging Face token is needed for model access, and the pyannote.audio user conditions must be accepted.

Example run:

open-dubbing --input_file video.mp4 --target_language=cat --hugging_face_token=TOKEN
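For batch jobs, the CLI can be driven from a short script. This is a sketch assuming only the documented flags above; the videos/ directory and the TOKEN placeholder are illustrative.

# Dub every .mp4 in a folder by invoking the open-dubbing CLI per file.
import subprocess
from pathlib import Path

for video in Path("videos").glob("*.mp4"):
    subprocess.run(
        ["open-dubbing", "--input_file", str(video),
         "--target_language=cat", "--hugging_face_token=TOKEN"],
        check=True,  # stop on the first failed dub
    )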
Maintenance & Community
The project is developed by Softcatalà. Contact: Jordi Mas (jmas@softcatala.org).
Licensing & Compatibility
The project appears to be under a permissive license, but core libraries used (like pyannote.audio) may have their own terms. Commercial use should be verified against all dependencies.
Limitations & Caveats
This is an experimental project, and errors can occur at any stage of the pipeline (speech recognition, translation, TTS). Language support depends on the specific combination of STT, translation, and TTS models used.