MioSub  by corvo007

AI-powered subtitle generation and translation for video and audio

Created 3 months ago
354 stars

Top 79.1% on SourcePulse

Project Summary

MioSub is an all-in-one automated subtitle generator for content creators and localization professionals. It handles downloading, transcription, translation, and hardcoding of subtitles without manual intervention, reducing the time and effort needed to subtitle a wide range of media.

How It Works

MioSub uses AI for its core functions: a CTC aligner for millisecond-precision timing, and Google Gemini for transcription and translation across 100+ languages. It also supports OpenAI Whisper for speech-to-text, with local and whisper.cpp options. The fully automated workflow accepts links or files and additionally performs speaker diarization and term extraction.
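To make the timing idea concrete, here is a minimal sketch of how per-frame CTC output can be turned into millisecond timestamps. It shows the greedy CTC collapse rule only (merge repeated labels, drop blanks), not a full forced aligner, and it is not MioSub's own code. The blank symbol `_`, the 20 ms frame length, and the names `Segment` and `collapse_ctc` are assumptions chosen for illustration.

```python
from dataclasses import dataclass

BLANK = "_"  # hypothetical blank symbol; real models define their own


@dataclass
class Segment:
    token: str
    start_ms: int
    end_ms: int


def collapse_ctc(frames: list[str], frame_ms: int = 20) -> list[Segment]:
    """Collapse per-frame CTC labels into timed token segments.

    Consecutive identical labels are merged into one segment and blank
    frames are dropped. A blank between two identical labels keeps them
    as separate tokens, as the CTC collapse rule requires.
    """
    segments: list[Segment] = []
    prev = BLANK
    for i, label in enumerate(frames):
        if label != BLANK:
            if label == prev:
                # Same token continues: extend the current segment.
                segments[-1].end_ms = (i + 1) * frame_ms
            else:
                # A new token starts at this frame.
                segments.append(Segment(label, i * frame_ms, (i + 1) * frame_ms))
        prev = label
    return segments


if __name__ == "__main__":
    for seg in collapse_ctc(["_", "h", "h", "_", "i", "i", "i", "_"]):
        print(seg)
```

The timing resolution of such a scheme is bounded by the frame length, so a production aligner would typically work with whatever frame rate its acoustic model emits.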

Quick Start & Requirements

  • Installation: Downloadable executables for Windows (64-bit), macOS (12+), and Linux (x64/arm64 AppImage).
  • Prerequisites: 4GB+ RAM and a network connection. Requires a Google Gemini API key (Gemini 2.5/3 Flash/Pro); a custom endpoint is optional. Local Whisper needs extra configuration.
  • Setup: Configure the API key, then paste a video/audio link or upload a file.
  • Local Development: Node.js 18+.

Highlighted Details

  • Fully automated workflow from input to finished subtitles.
  • Millisecond-precision time-axis alignment via built-in CTC aligner.
  • 100+ language support for transcription/translation, with automatic term extraction and speaker diarization.
  • Handles video and pure audio files (podcasts, audiobooks).
  • Integrated editor with real-time preview and SRT/ASS import/export.
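For readers unfamiliar with the SRT format mentioned in the list above, the following sketch shows how millisecond timings map onto SRT cues, which use `HH:MM:SS,mmm` timestamps separated by `-->`. The function names `srt_timestamp` and `srt_cue` are hypothetical and are not taken from MioSub.

```python
def srt_timestamp(ms: int) -> str:
    """Format a non-negative millisecond offset as an SRT timestamp."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one SRT cue: index line, timing line, then the subtitle text."""
    return f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"


if __name__ == "__main__":
    # Cues in an SRT file are separated by a blank line.
    cues = [srt_cue(1, 0, 1500, "Hello."), srt_cue(2, 1500, 3723004, "Goodbye.")]
    print("\n".join(cues))
```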

Maintenance & Community

Actively developed (v3.0); contributions are encouraged via issues and PRs. Key dependencies: Google Gemini, OpenAI Whisper, yt-dlp, FFmpeg. No specific community links or roadmap are mentioned.
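As a rough illustration of what the yt-dlp and FFmpeg dependencies are typically used for, the sketch below builds a download command and a subtitle-hardcoding command. How MioSub actually invokes these tools is not stated in this summary, so the function names, file names, and chosen options are assumptions. The sketch only constructs the argument lists and does not run them.

```python
import shlex


def download_cmd(url: str, out_path: str) -> list[str]:
    """yt-dlp invocation that saves the media at `url` to `out_path`."""
    return ["yt-dlp", "-o", out_path, url]


def hardcode_cmd(video: str, subs: str, out_path: str) -> list[str]:
    """FFmpeg invocation that burns `subs` into the video frames.

    Uses FFmpeg's `subtitles` video filter and copies the audio stream
    unchanged. Paths containing special characters would need escaping
    for the filter syntax; this sketch assumes simple file names.
    """
    return ["ffmpeg", "-i", video, "-vf", f"subtitles={subs}", "-c:a", "copy", out_path]


if __name__ == "__main__":
    # Hypothetical URL and file names, printed as shell-quoted commands.
    print(shlex.join(download_cmd("https://example.com/watch?v=abc", "input.mp4")))
    print(shlex.join(hardcode_cmd("input.mp4", "subs.srt", "output.mp4")))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with user-supplied URLs and paths.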

Licensing & Compatibility

Licensed under AGPL-3.0. This strong copyleft license requires derivative works to also be AGPL-3.0, potentially restricting closed-source commercial integration.

Limitations & Caveats

Requires a Google Gemini API key, which may incur usage costs. Advanced configurations (e.g., local Whisper) require consulting external documentation. The AGPL-3.0 license may restrict closed-source commercial use.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 6

Star History

195 stars in the last 30 days
