gemini-srt-translator  by MaKTaiL

Subtitle translation powered by Google Gemini AI

Created 1 year ago
264 stars

Top 96.8% on SourcePulse

Project Summary

gemini-srt-translator is a Python tool that uses Google Gemini AI to translate SRT subtitle files for videos, movies, and series. It preserves the original timestamps and formatting, and it offers both a command-line interface and a Python API, so it can be run by hand or built into automated workflows.

How It Works

The tool uses Google Gemini models to translate SRT subtitle files and, with Gemini 2.5, can draw on the model's reasoning ('thinking') capability. Original timestamps and basic SRT formatting are kept unchanged, so the translated subtitles stay synchronized with playback. It can also extract subtitles and audio context directly from video files (this requires FFmpeg) to improve translation accuracy, and users can supply a custom description to guide the model on specific terminology or context.
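To make "preserving timestamps" concrete, here is a minimal sketch of the general idea, not the project's actual code. It parses SRT blocks, passes only the text to a translation callback, and writes the index and timing lines back verbatim. The names `parse_srt`, `render_srt`, `translate_cues`, and `fake_translate` are hypothetical; `fake_translate` stands in for a real call to a Gemini model.

```python
import re
from typing import Callable, List, NamedTuple


class Cue(NamedTuple):
    index: str   # sequence number line, kept verbatim
    timing: str  # "HH:MM:SS,mmm --> HH:MM:SS,mmm" line, kept verbatim
    text: str    # subtitle text (may span several lines)


def parse_srt(content: str) -> List[Cue]:
    """Split SRT content into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        # A valid block has an index line followed by a timing line.
        if len(lines) >= 2 and "-->" in lines[1]:
            cues.append(Cue(lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def render_srt(cues: List[Cue]) -> str:
    """Reassemble cues into SRT text with one blank line between blocks."""
    return "\n\n".join(f"{c.index}\n{c.timing}\n{c.text}" for c in cues) + "\n"


def translate_cues(cues: List[Cue], translate: Callable[[str], str]) -> List[Cue]:
    """Replace only the text of each cue; index and timing are untouched."""
    return [c._replace(text=translate(c.text)) for c in cues]


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for a model call: upper-cases the text."""
    return text.upper()
```

Because the translation callback only ever sees `Cue.text`, it cannot alter the timing lines, whatever the model returns.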

Quick Start & Requirements

  • Installation: pip install --upgrade gemini-srt-translator. A virtual environment is recommended.
  • Prerequisites: A Google Gemini API key is mandatory. FFmpeg is required for video/audio extraction features.
  • API Key Setup: Can be set via environment variable (GEMINI_API_KEY), command-line argument (-k), or directly in Python code (gst.gemini_api_key).
  • Resources: Obtain API keys from Google AI Studio.
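As an illustration of the key setup options above, the following sketch resolves an API key from an explicit value (for example one passed with -k) or from the GEMINI_API_KEY environment variable. The helper `resolve_api_key` is hypothetical, and the precedence shown (explicit value first, then environment) is an assumption, not something the README states.

```python
import os
from typing import Optional


def resolve_api_key(cli_key: Optional[str] = None) -> str:
    """Return an API key, preferring an explicit value over the environment.

    Assumption: an explicitly supplied key wins over GEMINI_API_KEY.
    """
    key = cli_key or os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError(
            "No Gemini API key found: pass one explicitly or set GEMINI_API_KEY."
        )
    return key
```

Failing early with a clear message avoids starting a long translation run that would only fail at the first API call.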

Highlighted Details

  • Full command-line interface (CLI) and Python API support.
  • Automatic SRT subtitle extraction and translation from video files.
  • Audio extraction from video/audio files for improved translation context.
  • Advanced AI features, including contextual reasoning and 'thinking' capability (Gemini 2.5+).
  • Customizable translation parameters (model, batch size, temperature, description).
  • Quick resume functionality for interrupted translations.
  • Optional progress and thinking process logging.
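Batching and quick resume can be pictured with the sketch below. It is a generic illustration under stated assumptions, not the tool's implementation: the function `translate_with_resume`, the JSON progress file, and the rule that `translate_batch` returns exactly one output per input line are all hypothetical.

```python
import json
import os
from typing import Callable, List


def translate_with_resume(
    lines: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int,
    progress_path: str,
) -> List[str]:
    """Translate lines in batches, saving progress after each batch."""
    done: List[str] = []
    # Resume: reload whatever an earlier, interrupted run already finished.
    if os.path.exists(progress_path):
        with open(progress_path, encoding="utf-8") as fh:
            done = json.load(fh)

    for start in range(len(done), len(lines), batch_size):
        batch = lines[start:start + batch_size]
        done.extend(translate_batch(batch))
        # Persist after every batch so an interruption loses at most one batch.
        with open(progress_path, "w", encoding="utf-8") as fh:
            json.dump(done, fh, ensure_ascii=False)

    # Finished: the progress file is no longer needed.
    if os.path.exists(progress_path):
        os.remove(progress_path)
    return done
```

A larger batch size means fewer requests but more work lost on interruption; this trade-off is one reason a batch size setting is useful.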

Maintenance & Community

The project is maintained by MaKTaiL, with contributions from several individuals listed in the repository. No specific community channels (like Discord/Slack) or roadmaps are linked in the README.

Licensing & Compatibility

Distributed under the MIT License, which generally permits commercial use and modification.

Limitations & Caveats

Video and audio processing features require FFmpeg to be installed. The advanced 'thinking' capability is available only with Gemini 2.5 models. A valid Google Gemini API key is required for all operations.
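Since the video and audio features depend on FFmpeg, a script can check for it before starting. This is a small sketch using only the Python standard library; the helper names `ffmpeg_available` and `require_ffmpeg` are hypothetical and not part of the tool.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def require_ffmpeg() -> None:
    """Raise a clear error when FFmpeg is missing."""
    if not ffmpeg_available():
        raise RuntimeError(
            "FFmpeg not found on PATH; video and audio features are unavailable."
        )
```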

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 10
  • Star history: 15 stars in the last 30 days
