WAAS by schibsted

Whisper-as-a-Service provides GUI/API access to OpenAI's Whisper model

created 2 years ago
1,916 stars

Top 23.2% on sourcepulse

Project Summary

This project provides a scalable, queue-based service for transcribing audio using OpenAI's Whisper model, accessible via a REST API and a local web GUI. It targets developers and users needing efficient, high-volume audio transcription, offering features like email notifications, webhook callbacks, and an in-browser editor for transcription correction.

How It Works

The system utilizes a Flask API for handling transcription requests, which are then placed into a Redis queue managed by RQ workers. These workers process the audio files, leveraging OpenAI's Whisper models for transcription. The architecture supports GPU acceleration via Docker for faster processing and includes a local JavaScript-based editor for manual transcription refinement.
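The enqueue-then-process flow described above can be sketched with the Python standard library alone. This is only an illustration of the pattern, not the project's code: `queue.Queue` stands in for the Redis queue, a thread stands in for an RQ worker, and `fake_transcribe`, `submit`, and `run_demo` are hypothetical names invented for this sketch.

```python
import queue
import threading
import uuid

# In-process stand-ins: the real service uses Redis and RQ, not these objects.
jobs = queue.Queue()
results = {}


def fake_transcribe(audio_name):
    # Placeholder for the Whisper call a real worker would make.
    return f"transcript of {audio_name}"


def submit(audio_name):
    """API side: record the job, enqueue it, and return an ID immediately."""
    job_id = str(uuid.uuid4())
    results[job_id] = {"status": "queued", "text": None}
    jobs.put((job_id, audio_name))
    return job_id


def worker():
    """Worker side: pull jobs until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        job_id, audio_name = item
        results[job_id] = {"status": "finished", "text": fake_transcribe(audio_name)}
        jobs.task_done()


def run_demo():
    """Start one worker, submit one job, and return its final record."""
    thread = threading.Thread(target=worker)
    thread.start()
    job_id = submit("interview.mp3")
    jobs.put(None)  # tell the worker to stop after the queued job
    jobs.join()
    thread.join()
    return results[job_id]
```

The point of the design is that `submit` returns before any transcription happens, so a slow model run never blocks the request that created the job.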

Quick Start & Requirements

  • Installation: pip install -r requirements.txt (within a Python virtual environment).
  • Prerequisites: Python 3.8-3.10, Redis, and optionally NVIDIA GPU with CUDA for accelerated processing.
  • Docker Setup: Create a .envrc file with API keys and webhook configuration, then run docker-compose --env-file .envrc up.
  • GPU Support: Requires nvidia-docker and modifications to docker-compose.yml to enable the NVIDIA runtime.
  • Docs: API documentation is embedded within the README.
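Because the supported interpreter range is narrow, a small preflight check before installing dependencies may save time. The sketch below only encodes the 3.8 to 3.10 range listed above; `python_supported` is a hypothetical helper, not something shipped with the project.

```python
import sys

# Supported range as stated in the prerequisites above.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def python_supported(version=None):
    """Return True if (major, minor) falls within 3.8 to 3.10 inclusive."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION
```

Calling `python_supported()` with no argument checks the running interpreter; passing a tuple such as `(3, 11)` checks a specific version.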

Highlighted Details

  • REST API for asynchronous transcription jobs.
  • Local web GUI with an in-browser editor for transcription correction.
  • Supports email notifications and webhook callbacks for job completion.
  • Offers various output formats: JSON, SRT, VTT, TXT.
  • Language detection endpoint available.

Maintenance & Community

The project is from Schibsted, a Norwegian media and technology company. Specific community channels or active development signals are not detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README's description states "No video support", yet its GUI section mentions video file uploads, so it is unclear whether video input is actually handled. The README also does not specify a license, which leaves the terms for reuse and redistribution undefined.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 17 stars in the last 90 days
