Whisper-as-a-Service provides GUI/API access to OpenAI's Whisper model
This project provides a scalable, queue-based service for transcribing audio using OpenAI's Whisper model, accessible via a REST API and a local web GUI. It targets developers and users needing efficient, high-volume audio transcription, offering features like email notifications, webhook callbacks, and an in-browser editor for transcription correction.
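As a rough illustration of the API-driven workflow, the sketch below submits an audio file for transcription with Python's requests library. The endpoint path, field names, and callback parameter are hypothetical placeholders, not the project's documented API.

```python
# Hedged client sketch; the endpoint path and field names are hypothetical.
import requests

with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "http://localhost:5000/jobs",                    # hypothetical Flask endpoint
        files={"file": audio},                           # the audio file to transcribe
        data={
            "email": "user@example.com",                 # optional email notification target
            "callback_url": "https://example.com/hook",  # optional webhook callback
        },
    )

print(response.json())  # e.g. a job id to poll, or await the webhook callback
```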
How It Works
The system utilizes a Flask API for handling transcription requests, which are then placed into a Redis queue managed by RQ workers. These workers process the audio files, leveraging OpenAI's Whisper models for transcription. The architecture supports GPU acceleration via Docker for faster processing and includes a local JavaScript-based editor for manual transcription refinement.
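A minimal sketch of this queue-based pattern, not the project's actual code: a Flask endpoint enqueues a transcription job in Redis via RQ, and a worker process runs Whisper on the file. The endpoint name, function name, and queue name are illustrative assumptions.

```python
# Illustrative sketch of the Flask + Redis/RQ + Whisper pattern; not the project's code.
import whisper
from flask import Flask, jsonify, request
from redis import Redis
from rq import Queue

app = Flask(__name__)
queue = Queue("transcriptions", connection=Redis())  # jobs are stored in Redis


def transcribe_file(path: str) -> str:
    """Executed by an RQ worker: loads a Whisper model and transcribes the file."""
    model = whisper.load_model("base")          # model size is configurable
    result = model.transcribe(path)
    return result["text"]


@app.route("/transcribe", methods=["POST"])     # hypothetical endpoint name
def submit():
    path = request.json["path"]                 # path to an already-uploaded audio file
    job = queue.enqueue(transcribe_file, path)  # a worker picks this up asynchronously
    return jsonify({"job_id": job.get_id()})
```

A separate RQ worker process (started with rq worker transcriptions) consumes jobs from the same Redis instance, which is how GPU-equipped worker containers can scale independently of the API.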
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (within a Python virtual environment), create a .envrc file with API keys and webhook configurations, and start the stack with docker-compose --env-file .envrc up. GPU acceleration requires nvidia-docker and modifications to docker-compose.yml to enable the NVIDIA runtime.
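As one common way to enable the NVIDIA runtime in Compose, the excerpt below shows the kind of docker-compose.yml change this implies; the service name is hypothetical and the repository's actual file may differ.

```yaml
# docker-compose.yml (excerpt); "whisper-worker" is a hypothetical service name
services:
  whisper-worker:
    runtime: nvidia                      # requires the NVIDIA container runtime (nvidia-docker)
    environment:
      - NVIDIA_VISIBLE_DEVICES=all       # expose all GPUs to the container
```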
Highlighted Details
Maintenance & Community
The project is from Schibsted, a Norwegian media and technology company. Specific community channels or active development signals are not detailed in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project explicitly states "No video support" in the description, despite mentioning video file uploads in the GUI section. The README does not specify a license, which could impact adoption.