Stage-Whisper by Stage-Whisper

Transcription app for journalists using OpenAI's Whisper ASR

Created 3 years ago

259 stars

Top 97.9% on SourcePulse

Project Summary

Stage-Whisper provides a free, secure, and user-friendly transcription application for journalists and other users, leveraging OpenAI's Whisper ASR model for accurate speech-to-text conversion. It offers a graphical interface for managing and editing transcriptions, aiming to democratize access to advanced ASR technology for less technical users.

How It Works

The application comprises a Python backend that interfaces with OpenAI's Whisper library and a Node/Electron-based frontend for the user interface. This architecture separates the core ASR processing from the user interaction layer, allowing for independent development and potential scalability. The backend handles audio processing and transcription, while the frontend provides a GUI for file management, editing, and user interaction.
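As a rough illustration of the backend's role, the sketch below turns Whisper output into timestamped lines that an editing interface could display. It is not Stage-Whisper's actual code: the function names (format_timestamp, segments_to_lines, transcribe_file) and the "base" model choice are assumptions made for this example. The only library calls used are whisper.load_model and model.transcribe from the openai-whisper package, whose result dictionary includes a "segments" list with "start", "end", and "text" entries.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render an offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments: List[Dict]) -> List[str]:
    """Convert Whisper-style segments into one timestamped line each."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines


def transcribe_file(audio_path: str, model_name: str = "base") -> List[str]:
    """Hypothetical backend entry point: transcribe one file with Whisper.

    The import is deferred so the formatting helpers above can be used
    (and tested) without the openai-whisper package installed.
    """
    import whisper  # requires the openai-whisper package and ffmpeg

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_lines(result["segments"])
```

Keeping the formatting step separate from the model call mirrors the split described above: the transcription itself stays in the backend, while anything that consumes the segments only needs plain dictionaries.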

Quick Start & Requirements

Developing Stage-Whisper requires Node, Yarn, Python 3.x, Rust, ffmpeg, and Poetry. A sample installation command for macOS is provided. To run the backend standalone, run poetry install in the backend directory, then poetry run python stagewhisper --input /path/to/audio/file.mp3. To start the Electron interface, run yarn followed by yarn dev in the electron directory.
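The standalone backend command above can also be driven from a script, for example to process several files in a row. The following is a minimal sketch under stated assumptions: the helper names (build_backend_command, run_backend) are hypothetical, the backend directory is assumed to be named backend relative to the working directory, and it is assumed, not confirmed by the project, that the transcription is written to standard output.

```python
import subprocess
from typing import List


def build_backend_command(audio_path: str) -> List[str]:
    """Build the documented CLI invocation as an argument list."""
    return ["poetry", "run", "python", "stagewhisper", "--input", str(audio_path)]


def run_backend(audio_path: str, backend_dir: str = "backend") -> str:
    """Run the backend from its directory and return whatever it prints.

    Assumes `poetry install` has already been run in backend_dir and that
    the result appears on stdout (an assumption for this sketch).
    """
    completed = subprocess.run(
        build_backend_command(audio_path),
        cwd=backend_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the command as a list rather than a single shell string avoids quoting problems when audio file paths contain spaces.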

Highlighted Details

Leverages OpenAI's Whisper ASR model, trained on 680,000 hours of multilingual data.
Features a simple and intuitive graphical user interface for managing and editing transcriptions.
Aims for cross-platform compatibility (macOS, Windows, Linux).
Currently in early development with a working prototype.

Maintenance & Community

The project is led by @PeterSterne, @filmgirl, @HarrisLapiroff, and @Crazy4Pi314, with @oenu leading frontend development. Feature requests and bug reports can be submitted via GitHub issues and discussions. A Discord server is available for project discussions.

Licensing & Compatibility

Stage-Whisper's own code and OpenAI's Whisper are both MIT licensed. The project will adhere to all licensing terms, including those of dependencies such as FFmpeg.

Limitations & Caveats

The project is in the early stages of development, with a beta version planned for release soon. While a working prototype exists, users currently need to install multiple development dependencies.

Explore Similar Projects

LiveWhisper by Nikorasu

AudioToText by Carleslc

transcriber_app by davabase

whispo by egoist

speech-to-text by reriiasu

Scriberr by rishikanthc

writeout.ai by beyondcode

noScribe by kaixxx

WhisperLive by collabora

podcastfy by souzatharsis

ecoute by SevaSk

WhisperLiveKit by QuentinFuxa