subsai  by absadiki

Subtitle generation tool (Web-UI + CLI + Python package) using Whisper

created 2 years ago
1,535 stars

Top 27.6% on sourcepulse

Project Summary

Subs AI generates subtitles from audio and video files and targets both developers and end users. It builds on OpenAI's Whisper and its variants, offering transcription, translation, and subtitle manipulation through a Web UI, a CLI, and a Python package.

How It Works

The project integrates several Whisper implementations: OpenAI's original; whisper-timestamped for word-level timestamps; whisper.cpp for CPU inference; faster-whisper for optimized inference (reported as up to 4x faster than the original, with further gains available from quantization); and whisperX for high-speed transcription with word-level timestamps and speaker diarization. It also supports stable-ts for timestamp stabilization and Hugging Face Transformers for broader model compatibility. Translation is handled by NLLB and mBART models.
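As a minimal sketch of how the Python package might be driven with one of these backends: the `SubsAI` class and its `create_model` and `transcribe` methods follow the usage pattern in the project's README, but they should be treated as assumptions here, as should the `openai/whisper` model identifier. `subtitle_path` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def subtitle_path(media_file, fmt="srt"):
    """Place the subtitle file next to the media file, swapping the extension."""
    return Path(media_file).with_suffix("." + fmt.lstrip("."))


def transcribe(media_file, model_name="openai/whisper", model_config=None):
    """Transcribe media_file and save an .srt beside it (assumed subsai API)."""
    # Imported lazily so subtitle_path() works without subsai installed.
    from subsai import SubsAI

    subs_ai = SubsAI()
    # Config keys are backend-specific; an empty dict asks for the defaults.
    model = subs_ai.create_model(model_name, model_config or {})
    subs = subs_ai.transcribe(media_file, model)
    out = subtitle_path(media_file)
    subs.save(str(out))
    return out
```

Switching backends would then amount to passing a different `model_name`, with the exact identifiers taken from the project's own documentation.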

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/absadiki/subsai
  • Requires ffmpeg to be installed.
  • Recommended Python versions: 3.10 or 3.11 (3.12+ may have issues).
  • GPU acceleration requires PyTorch with CUDA support.
  • Official Docs: https://github.com/absadiki/subsai
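The requirements above can be checked before installing. The following is a small preflight sketch using only the standard library plus an optional PyTorch probe; the function names are hypothetical and not part of subsai.

```python
import shutil
import sys


def check_environment(version=sys.version_info[:2], which=shutil.which):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = []
    # ffmpeg must be discoverable on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # 3.10 and 3.11 are the recommended interpreter versions.
    if tuple(version) not in {(3, 10), (3, 11)}:
        problems.append(
            f"Python {version[0]}.{version[1]} is outside the recommended 3.10-3.11 range"
        )
    return problems


def cuda_available():
    """True only if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
    print("CUDA available:", cuda_available())
```

The `version` and `which` parameters exist only so the check can be exercised without depending on the real interpreter or PATH.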

Highlighted Details

  • Supports multiple Whisper backends, so users can trade transcription speed against accuracy.
  • Includes word-level timestamps and speaker diarization via whisperX.
  • Offers subtitle translation using Facebook's NLLB and mBART models.
  • Provides auto-sync functionality with ffsubsync.
  • Supports multiple subtitle formats (.srt, .vtt, .ass, etc.).
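The subtitle formats listed above differ in small details. As one illustration, independent of how subsai itself writes files, SRT separates milliseconds with a comma while WebVTT uses a period. The helper below is a hypothetical sketch of that difference.

```python
def format_timestamp(ms, fmt="srt"):
    """Render a millisecond offset as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    if ms < 0:
        raise ValueError("timestamp must be non-negative")
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    # SRT uses a comma before the milliseconds; WebVTT uses a period.
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"
```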

Maintenance & Community

  • Development appears active: the last commit was 1 week ago, with 3 issues and 0 pull requests recorded in the last 30 days.
  • Community interaction is encouraged via GitHub issues.

Licensing & Compatibility

  • Licensed under GNU General Public License v3.0 or later.
  • GPLv3 is a strong copyleft license: derivative works that are distributed must also be released under GPLv3. Commercial use is permitted, but the copyleft terms generally prevent linking with or embedding in distributed closed-source applications.

Limitations & Caveats

Python 3.12+ may have compatibility issues. The GPLv3 license does not forbid commercial use, but its copyleft terms significantly restrict integration with distributed proprietary software.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 3
  • Star History: 82 stars in the last 90 days
