SmartSub by buxuku

Cross-platform tool for batch generating & translating video/audio subtitles

created 1 year ago
2,707 stars

Top 17.9% on sourcepulse

Project Summary

SmartSub is a cross-platform desktop application designed for batch subtitle generation and translation for audio and video files. It caters to users who need to efficiently create and localize subtitles, offering local processing for privacy and speed, and supporting a wide array of translation services and hardware acceleration.

How It Works

The tool leverages the Whisper model for accurate speech-to-text transcription, supporting various model sizes for different hardware capabilities. For translation, it integrates with multiple services including cloud-based APIs (Baidu, Volcano, OpenAI-compatible) and local Ollama models, providing flexibility in cost and privacy. Hardware acceleration via NVIDIA CUDA and Apple's Core ML is supported for faster processing.
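
As an illustration of the transcription half of this flow, the sketch below turns timestamped speech segments into SubRip (SRT) text. It is a minimal sketch, not SmartSub's own code: the `Segment` type and both function names are hypothetical, and it assumes the recognizer yields start and end times in seconds, as Whisper-style tools commonly do.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One timestamped chunk of recognized speech (times in seconds)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds in the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Number the segments from 1 and join them into SRT cue blocks."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_line = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        cues.append(f"{index}\n{time_line}\n{seg.text.strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Working in whole milliseconds before splitting into hours, minutes, and seconds avoids rounding a value such as 59.9996 s into an invalid "60,000" seconds field.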

Quick Start & Requirements

  • Installation: Download the appropriate release binary for your OS (Windows, macOS) from the releases page.
  • Prerequisites:
    • NVIDIA GPU acceleration requires the CUDA Toolkit (compiled binaries are listed for versions 11.8.0, 12.2.0, and 12.4.1).
    • Apple Silicon Macs benefit from Core ML acceleration (download mac-arm64 release).
    • Optional: Local Whisper models and Ollama for offline transcription/translation.
  • Setup: Download and run the application. Model downloads and translation service configuration are handled within the app.
  • Links: Releases, CUDA Download, Hugging Face Whisper Models.
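
To make the CUDA prerequisite concrete, the sketch below reads the toolkit version out of `nvcc --version` output and picks one of the three listed builds. The selection policy (same major version, newest build that is not newer than the installed toolkit) is an assumption made for illustration, not a rule stated by the project, and all function names are hypothetical.

```python
import re
from typing import Optional

# CUDA versions the summary lists for the compiled binaries.
AVAILABLE_BUILDS = ("11.8.0", "12.2.0", "12.4.1")


def parse_version(text: str) -> tuple:
    """Turn '12.4' or '12.4.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def installed_cuda_version(nvcc_output: str) -> Optional[str]:
    """Extract the 'release X.Y' number from `nvcc --version` output, if present."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def pick_build(installed: str) -> Optional[str]:
    """Assumed policy: same major version, newest build not newer than the toolkit."""
    have = parse_version(installed)
    candidates = [
        build for build in AVAILABLE_BUILDS
        if parse_version(build)[0] == have[0] and parse_version(build)[:2] <= have[:2]
    ]
    return max(candidates, key=parse_version) if candidates else None
```

In practice the text passed to `installed_cuda_version` would come from running `nvcc --version`; a result of `None` from `pick_build` would suggest falling back to a CPU build or installing a listed toolkit version.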

Highlighted Details

  • Supports batch subtitle generation and translation for various audio/video formats.
  • Integrates with multiple translation services: Volcano, Baidu, Microsoft, DeepLX, Ollama, DeepSeek, DeerAPI, and OpenAI-compatible APIs.
  • Hardware acceleration: NVIDIA CUDA (Windows/Linux) and Apple Core ML (macOS M-series).
  • Customizable subtitle file naming and content (translation only or original + translation).
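
The naming and content options in the last item can be pictured with a small sketch. Everything here is hypothetical: the `{stem}` and `{lang}` placeholders, the mode names, and the choice to place the original line above the translation are assumptions for illustration, not the application's actual settings.

```python
from pathlib import Path


def output_name(media_file: str, lang: str, template: str = "{stem}.{lang}.srt") -> str:
    """Fill a naming template; {stem} and {lang} are hypothetical placeholders."""
    return template.format(stem=Path(media_file).stem, lang=lang)


def cue_text(original: str, translated: str, mode: str = "bilingual") -> str:
    """Return the text shown for one cue in the chosen content mode."""
    if mode == "translation_only":
        return translated
    if mode == "bilingual":
        # Assumed layout: original line first, translation underneath.
        return f"{original}\n{translated}"
    raise ValueError(f"unknown mode: {mode}")
```

For example, `output_name("talks/intro.mp4", "zh")` gives `intro.zh.srt`, and `cue_text("Hello", "你好", "translation_only")` keeps only the translated line.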

Maintenance & Community

The project is actively maintained by buxuku. Community support is available via WeChat groups.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive MIT license allows commercial use and integration into closed-source projects.

Limitations & Caveats

CUDA support is tested primarily via GitHub Actions, so users may encounter compatibility issues in their own environments. Batch translation through some cloud services may be subject to rate limiting. Transcription and translation accuracy and speed depend heavily on the chosen Whisper model and translation service.
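
One common way to cope with rate limiting in batch jobs is to retry with exponential backoff. The sketch below is a generic illustration, not part of SmartSub: `RateLimited` is a hypothetical exception standing in for whatever error a given translation client raises when throttled, and the retry count and delays are arbitrary defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error a translation client raises when throttled."""


def with_backoff(call: Callable[[], T], retries: int = 4, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, waiting base_delay * 2**attempt after each rate-limit error."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")
```

The `sleep` parameter is injectable so the waiting behavior can be checked without real delays; a caller would typically wrap each per-subtitle translation request, for example `with_backoff(lambda: translate(line))`, where `translate` is the caller's own function.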

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull requests (30d): 1
  • Issues (30d): 7

Star History

488 stars in the last 90 days
