Subtitle generator for offline bilingual transcription
This project provides a one-click solution for generating bilingual subtitles using Faster-Whisper and ModelScope, targeting users who need to create dual-language subtitles from audio/video files. It leverages offline large models for translation, offering a convenient and potentially faster alternative to cloud-based services.
How It Works
The system integrates Faster-Whisper for accurate speech-to-text transcription and ModelScope, an open-source platform for large models, for translation. This combination allows for offline processing, reducing reliance on external APIs and potentially improving privacy and speed. The workflow likely involves transcribing audio with Faster-Whisper and then translating the transcribed text using a ModelScope translation model.
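As a rough illustration of that two-stage flow (not the project's actual code), the sketch below transcribes a file with `faster-whisper` and translates each segment with a ModelScope CSANMT pipeline; the input file name, device settings, and translation model ID are assumptions:

```python
# Sketch of the transcribe-then-translate workflow (assumptions noted above).
from faster_whisper import WhisperModel
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Stage 1: offline speech-to-text with Faster-Whisper.
stt = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")
segments, _info = stt.transcribe("input.mp4")  # hypothetical input file

# Stage 2: offline translation with a ModelScope CSANMT model
# (the en->zh model ID here is illustrative).
translator = pipeline(Tasks.translation, model="damo/nlp_csanmt_translation_en2zh")

for seg in segments:
    zh = translator(seg.text)["translation"]
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text} / {zh}")
```

Because both models run locally, the only network access needed is the initial model download.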
Quick Start & Requirements
Create a conda environment (`conda create -n venv python=3.9`, `conda activate venv`) and install dependencies (`pip install -r requirements.txt`, `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118`). On Ubuntu/Debian, the system packages `python3.9-distutils` and `libsox-dev` are also required. Requires downloading the `whisper-large-v3-turbo` model. Ollama is used for conversational translation models (e.g., `ollama run qwen2:7b`), as sketched below. Launch the app with `python3 app.py`.
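For the Ollama route, each transcribed segment can be posted to a locally running Ollama server and the source/translation pair written out as one bilingual SRT cue. A minimal sketch assuming an Ollama instance on the default port; the prompt wording, model name, and output path are illustrative:

```python
# Sketch: translate transcript segments with a local Ollama model and
# write a bilingual SRT file (prompt and file names are illustrative).
import json
import urllib.request

def ollama_translate(text: str, model: str = "qwen2:7b") -> str:
    """Ask a local Ollama server to translate one subtitle line."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": model,
            "prompt": f"Translate to Chinese, reply with the translation only:\n{text}",
            "stream": False,
        }).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()

def fmt(t: float) -> str:
    """Seconds -> SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(t * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def write_bilingual_srt(segments, path: str = "output.srt") -> None:
    """Emit one SRT cue per segment: source line on top, translation below."""
    with open(path, "w", encoding="utf-8") as f:
        for i, seg in enumerate(segments, 1):
            f.write(f"{i}\n{fmt(seg.start)} --> {fmt(seg.end)}\n")
            f.write(f"{seg.text.strip()}\n{ollama_translate(seg.text)}\n\n")
```

Fed the Faster-Whisper segments from the earlier sketch, `write_bilingual_srt(segments)` would produce a dual-language subtitle file.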
Maintenance & Community
The project credits `faster-whisper` and `Csanmt`. Further community or maintenance details are not provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. The use of Faster-Whisper and ModelScope implies adherence to their respective licenses. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is described as supporting specific bilingual subtitle pairs, so other language pairs may not be supported. The offline large-model approach also implies substantial local compute and storage requirements for running the models.