Subtitle generator for media servers
This project provides an automated subtitle generation service for media files, leveraging OpenAI's Whisper model. It integrates with popular media servers like Jellyfin, Plex, and Emby, as well as Bazarr, to automatically create SRT or LRC subtitle files from audio or video content. The primary benefit is ensuring that all media has subtitles, whether users need them for accessibility or simply prefer them.
How It Works
The service operates via webhooks triggered by media server events (e.g., new media added, playback started) or through direct integration with Bazarr. It utilizes faster-whisper and stable-ts for transcription, offering flexibility in model selection (from tiny to large-v3-turbo) and compute device (CPU or GPU via CUDA). The system can transcribe audio in its original language or translate it to English, with extensive configuration options for language detection, skipping existing subtitles, and preferred audio tracks.
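To illustrate the transcription step, below is a minimal sketch using the faster-whisper library to produce an SRT file. The model size, device, and file paths are assumptions for the example; the project's actual pipeline (including its stable-ts integration) differs.

```python
from faster_whisper import WhisperModel

def srt_timestamp(seconds: float) -> str:
    # SRT timestamps use the form HH:MM:SS,mmm
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"

# Model size and device are illustrative; device="cpu" with compute_type="int8"
# also works on machines without CUDA.
model = WhisperModel("medium", device="cuda", compute_type="float16")

# task="transcribe" keeps the original language; task="translate" produces English.
segments, info = model.transcribe("episode.mkv", task="transcribe", beam_size=5)
print(f"Detected language: {info.language} ({info.language_probability:.0%})")

with open("episode.srt", "w", encoding="utf-8") as srt:
    for index, segment in enumerate(segments, start=1):
        srt.write(f"{index}\n")
        srt.write(f"{srt_timestamp(segment.start)} --> {srt_timestamp(segment.end)}\n")
        srt.write(f"{segment.text.strip()}\n\n")
```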
Quick Start & Requirements
Deployment is via Docker, using the mccloud/subgen:latest or mccloud/subgen:cpu image.
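As a concrete starting point, here is a deployment sketch using the Python Docker SDK; the webhook port, volume mount, and environment variable shown are illustrative assumptions rather than documented defaults, and most deployments would use docker run or compose directly.

```python
import docker

client = docker.from_env()
client.containers.run(
    "mccloud/subgen:latest",                     # swap in mccloud/subgen:cpu for CPU-only hosts
    name="subgen",
    detach=True,
    ports={"9000/tcp": 9000},                    # assumed webhook port; adjust to your setup
    volumes={"/path/to/media": {"bind": "/media", "mode": "rw"}},  # assumed media mount point
    environment={"TRANSCRIBE_DEVICE": "cpu"},    # hypothetical variable name, for illustration only
    restart_policy={"Name": "unless-stopped"},
)
```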
Highlighted Details
Supports Whisper models up to large-v3-turbo and distil variants. Configuration options include MONITOR for folder watching, LRC_FOR_AUDIO_FILES for audio-specific output formats, and advanced subtitle regrouping (see the sketch below).
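The regrouping behaviour comes from stable-ts; the following is a minimal sketch of that library's usage under assumed model and file names, not the project's actual code.

```python
import stable_whisper

# Load a Whisper model through stable-ts; load_faster_whisper() swaps in the faster-whisper backend.
model = stable_whisper.load_model("base")

# regroup=True applies stable-ts's default regrouping algorithm, splitting and merging
# segments into more natural subtitle lines.
result = model.transcribe("podcast_episode.mp3", regroup=True)

# Write sentence-level subtitles; word_level=True would add per-word (karaoke-style) timing.
result.to_srt_vtt("podcast_episode.srt", word_level=False)
```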
Maintenance & Community
The project is actively maintained, with frequent updates addressing bugs and adding features. Community support and discussions are available via GitHub Discussions.
Licensing & Compatibility
The project is released under an unspecified license. Compatibility for commercial use or closed-source linking is not explicitly stated.
Limitations & Caveats
The project is maintained by a single developer without formal deployment experience, and transcription accuracy depends on the performance of the chosen Whisper model. Some features, such as the web UI, have been removed in favor of environment variable configuration.