Whisper-as-a-Service provides GUI/API access to OpenAI's Whisper model
This project provides a scalable, queue-based service for transcribing audio using OpenAI's Whisper model, accessible via a REST API and a local web GUI. It targets developers and users needing efficient, high-volume audio transcription, offering features like email notifications, webhook callbacks, and an in-browser editor for transcription correction.
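As a rough illustration of the API-driven workflow, the sketch below submits an audio file for transcription with Python's requests library. The endpoint path, field names, and callback parameter are hypothetical placeholders, not the project's documented API.

```python
# Hedged client sketch; the endpoint path and field names are hypothetical.
import requests

with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "http://localhost:5000/jobs",                    # hypothetical Flask endpoint
        files={"file": audio},                           # the audio file to transcribe
        data={
            "email": "user@example.com",                 # optional email notification target
            "callback_url": "https://example.com/hook",  # optional webhook callback
        },
    )

print(response.json())  # e.g. a job id to poll, or await the webhook callback
```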
How It Works
The system utilizes a Flask API for handling transcription requests, which are then placed into a Redis queue managed by RQ workers. These workers process the audio files, leveraging OpenAI's Whisper models for transcription. The architecture supports GPU acceleration via Docker for faster processing and includes a local JavaScript-based editor for manual transcription refinement.
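A minimal sketch of this queue-based pattern, not the project's actual code: a Flask endpoint enqueues a transcription job in Redis via RQ, and a worker process runs Whisper on the file. The endpoint name, function name, and queue name are illustrative assumptions.

```python
# Illustrative sketch of the Flask + Redis/RQ + Whisper pattern; not the project's code.
import whisper
from flask import Flask, jsonify, request
from redis import Redis
from rq import Queue

app = Flask(__name__)
queue = Queue("transcriptions", connection=Redis())  # jobs are stored in Redis


def transcribe_file(path: str) -> str:
    """Executed by an RQ worker: loads a Whisper model and transcribes the file."""
    model = whisper.load_model("base")          # model size is configurable
    result = model.transcribe(path)
    return result["text"]


@app.route("/transcribe", methods=["POST"])     # hypothetical endpoint name
def submit():
    path = request.json["path"]                 # path to an already-uploaded audio file
    job = queue.enqueue(transcribe_file, path)  # a worker picks this up asynchronously
    return jsonify({"job_id": job.get_id()})
```

A separate RQ worker process (started with rq worker transcriptions) consumes jobs from the same Redis instance, which is how GPU-equipped worker containers can scale independently of the API.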
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (within a Python virtual environment), create a .envrc file with API keys and webhook configurations, and start the stack with docker-compose --env-file .envrc up. GPU acceleration requires nvidia-docker and modifications to docker-compose.yml to enable the NVIDIA runtime.
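As one common way to enable the NVIDIA runtime in Compose, the excerpt below shows the kind of docker-compose.yml change this implies; the service name is hypothetical and the repository's actual file may differ.

```yaml
# docker-compose.yml (excerpt); "whisper-worker" is a hypothetical service name
services:
  whisper-worker:
    runtime: nvidia                      # requires the NVIDIA container runtime (nvidia-docker)
    environment:
      - NVIDIA_VISIBLE_DEVICES=all       # expose all GPUs to the container
```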
Highlighted Details
Maintenance & Community
The project is from Schibsted, a Norwegian media and technology company. Specific community channels or active development signals are not detailed in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project explicitly states "No video support" in the description, despite mentioning video file uploads in the GUI section. The README does not specify a license, which could impact adoption.