CLI tool for batch speech-to-text using Whisper
Whisperer is a batch speech-to-text tool for generating subtitles from video and audio files. It leverages OpenAI's Whisper model, specifically the GPU-accelerated whisper.cpp implementation, to process multiple files concurrently, scaling with available GPU memory.
How It Works
Whisperer uses whisper.cpp for efficient, GPU-accelerated inference of the Whisper model. It supports batch processing, launching multiple instances of the model to maximize throughput based on the user's GPU memory capacity. This approach offers significant speed improvements over CPU-based processing.
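The scheduling idea above can be sketched as follows. This is a minimal illustration, not Whisperer's actual implementation: the `whisper-cli` binary name, the per-model memory estimate, and the flags are assumptions based on typical whisper.cpp usage.

```python
import concurrent.futures
import subprocess
from pathlib import Path

# Assumed whisper.cpp CLI binary name (hypothetical; adjust for your build).
WHISPER_BIN = "whisper-cli"

def n_workers(gpu_mem_mb: int, model_mem_mb: int = 1500) -> int:
    """Estimate how many model instances fit in GPU memory (at least one)."""
    return max(1, gpu_mem_mb // model_mem_mb)

def transcribe(audio: Path, model: Path) -> subprocess.CompletedProcess:
    # One whisper.cpp process per file; -osrt writes an .srt next to the input.
    cmd = [WHISPER_BIN, "-m", str(model), "-f", str(audio), "-osrt"]
    return subprocess.run(cmd, capture_output=True, text=True)

def batch(files: list[Path], model: Path, gpu_mem_mb: int):
    # Run as many concurrent instances as the GPU memory estimate allows.
    workers = n_workers(gpu_mem_mb)
    with concurrent.futures.ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe, files, [model] * len(files)))
```

With an 8 GB card and a ~1.5 GB model, this sketch would run five instances in parallel; larger models reduce the worker count accordingly.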
Quick Start & Requirements
pip install whisperer
Requires ffmpeg to be on the system's PATH. Download a Whisper model from ggerganov/whisper.cpp (https://github.com/ggerganov/whisper.cpp/tree/main); do not use v3 models, as they are not yet supported.
Highlighted Details
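The requirements above can be expressed as a small pre-flight check. This is a hedged sketch: the model-name convention (`ggml-*.bin`, with "v3" in v3 model filenames) follows whisper.cpp's usual naming and is an assumption, not part of Whisperer's documented interface.

```python
import shutil

def ffmpeg_available() -> bool:
    # ffmpeg is a required external dependency; it must be on PATH.
    return shutil.which("ffmpeg") is not None

def is_supported_model(filename: str) -> bool:
    # v3 models (e.g. a hypothetical "ggml-large-v3.bin") are not yet supported.
    return "v3" not in filename
```

Checking `is_supported_model("ggml-base.en.bin")` before launching a batch avoids failing partway through on an unsupported model.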
Uses whisper.cpp for significant GPU-accelerated speed improvements.
Maintenance & Community
No specific community channels or roadmap are mentioned in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.
Limitations & Caveats
The project explicitly states that v3 Whisper models are not yet supported. ffmpeg is a required external dependency not included with the package.
Last repository activity: 4 months ago; the project is marked inactive.