Stage-Whisper  by Stage-Whisper

Transcription app for journalists using OpenAI's Whisper ASR

created 2 years ago
257 stars

Top 98.8% on sourcepulse

Project Summary

Stage-Whisper provides a free, secure, and user-friendly transcription application for journalists and other users, leveraging OpenAI's Whisper ASR model for accurate speech-to-text conversion. It offers a graphical interface for managing and editing transcriptions, aiming to democratize access to advanced ASR technology for less technical users.

How It Works

The application comprises a Python backend that interfaces with OpenAI's Whisper library and a Node/Electron-based frontend for the user interface. This architecture separates the core ASR processing from the user interaction layer, allowing for independent development and potential scalability. The backend handles audio processing and transcription, while the frontend provides a GUI for file management, editing, and user interaction.
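
As a rough illustration of the backend's role, the sketch below transcribes one file with the openai-whisper package and turns the result into timestamped lines that a GUI could present for editing. It is not Stage Whisper's actual backend code: the function names, the "base" model choice, and the line format are assumptions made for this example.

```python
from typing import Iterable, List, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments: Iterable[Mapping]) -> List[str]:
    """Turn Whisper-style segments (start, end, text) into editable lines."""
    return [
        f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in segments
    ]


def transcribe_file(path: str, model_name: str = "base") -> List[str]:
    """Transcribe one audio file; needs openai-whisper and ffmpeg installed."""
    # Imported lazily so the helpers above can be used without the package.
    import whisper

    model = whisper.load_model(model_name)  # "base" is an arbitrary choice
    result = model.transcribe(path)
    return segments_to_lines(result["segments"])
```

Keeping the formatting helpers free of any Whisper dependency mirrors the backend/frontend split described above: the presentation logic can be exercised without loading a model.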

Quick Start & Requirements

Developing Stage Whisper requires Node, Yarn, Python 3.x, Rust, ffmpeg, and Poetry. The project provides a sample installation command for macOS. To run the backend standalone, run poetry install in the backend directory, then poetry run python stagewhisper --input /path/to/audio/file.mp3. To start the Electron interface, run yarn and then yarn dev in the electron directory.
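
Before running these commands it can help to confirm that the prerequisites are on the PATH. The sketch below is a hypothetical preflight helper, not part of the project: the executable names (for example cargo standing in for Rust, and python3, which may be python on Windows) are assumptions, and backend_command only assembles the standalone backend invocation quoted above.

```python
import shutil
from typing import Callable, List, Optional, Sequence

# Executable names assumed to correspond to the listed prerequisites.
# "cargo" stands in for the Rust toolchain; "python3" may be "python" on Windows.
REQUIRED_TOOLS = ("node", "yarn", "python3", "cargo", "ffmpeg", "poetry")


def missing_tools(
    tools: Sequence[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def backend_command(audio_path: str) -> List[str]:
    """Build the standalone backend command as an argument list."""
    return ["poetry", "run", "python", "stagewhisper", "--input", audio_path]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        # Run this from the backend directory after `poetry install`.
        print("Ready to run: " + " ".join(backend_command("/path/to/audio/file.mp3")))
```

The argument list returned by backend_command can be passed to subprocess.run with cwd set to the backend directory.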

Highlighted Details

  • Leverages OpenAI's Whisper ASR model, trained on 680,000 hours of multilingual data.
  • Features a simple and intuitive graphical user interface for managing and editing transcriptions.
  • Aims for cross-platform compatibility (macOS, Windows, Linux).
  • Currently in early development with a working prototype.

Maintenance & Community

The project is led by @PeterSterne, @filmgirl, @HarrisLapiroff, and @Crazy4Pi314, with @oenu leading frontend development. Feature requests and bug reports can be submitted via GitHub issues and discussions. A Discord server is available for project discussions.

Licensing & Compatibility

Both the Stage Whisper-specific code and OpenAI's Whisper are MIT licensed. The project states that it will adhere to all licensing terms, including those of dependencies such as FFmpeg.

Limitations & Caveats

The project is in the early stages of development, with a beta version planned for release soon. While a working prototype exists, users currently need to install multiple development dependencies.

Health Check

Last commit: 2 years ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History: 2 stars in the last 90 days