whisper_real_time by davabase

Demo for real-time speech-to-text using OpenAI's Whisper

created 2 years ago
2,797 stars

Top 17.4% on sourcepulse

Project Summary

This project provides a real-time transcription system using OpenAI's Whisper model, targeting developers and researchers needing live speech-to-text capabilities. It enables continuous audio processing for immediate transcription output.

How It Works

The system operates by continuously recording audio in a separate thread. It concatenates raw audio bytes from multiple recordings to maintain a continuous stream for the Whisper model. This approach allows for near-instantaneous transcription of spoken words as they are detected.
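Below is a minimal sketch of that loop, assuming the SpeechRecognition library (with PyAudio) for background capture and the openai-whisper package for inference. It illustrates the approach described above rather than the project's exact code; the model name, chunk length, and buffer handling are placeholder choices.

```python
# Sketch only: background recording thread pushes raw PCM bytes into a queue,
# the main loop concatenates them and re-transcribes the growing stream.
# Assumes SpeechRecognition + PyAudio for capture and openai-whisper for inference,
# with 16 kHz, 16-bit mono audio.
from queue import Queue
from time import sleep

import numpy as np
import speech_recognition as sr
import whisper

audio_queue: Queue = Queue()        # raw byte chunks produced by the recording thread
model = whisper.load_model("base")  # placeholder model choice

def record_callback(_, audio: sr.AudioData) -> None:
    # Runs on the background recording thread; hand raw PCM bytes to the main loop.
    audio_queue.put(audio.get_raw_data())

recognizer = sr.Recognizer()
microphone = sr.Microphone(sample_rate=16000)
with microphone as source:
    recognizer.adjust_for_ambient_noise(source)

# Non-blocking background recording in short chunks (2 s here, an arbitrary choice).
recognizer.listen_in_background(microphone, record_callback, phrase_time_limit=2)

collected = b""
while True:
    # Append every chunk recorded so far to one continuous byte buffer.
    while not audio_queue.empty():
        collected += audio_queue.get()
    if collected:
        # Convert 16-bit PCM bytes to the float32 array Whisper expects.
        samples = np.frombuffer(collected, dtype=np.int16).astype(np.float32) / 32768.0
        result = model.transcribe(samples, fp16=False)
        print(result["text"].strip(), flush=True)
    sleep(0.25)
    # A fuller implementation would reset the buffer at phrase boundaries
    # so the re-transcribed stream does not grow without bound.
```

The key idea is that each new chunk is appended to the same byte buffer, so Whisper always sees one continuous utterance rather than a series of disconnected clips.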

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires ffmpeg to be installed system-wide. Installation instructions are provided for Ubuntu/Debian, Arch Linux, macOS, and Windows (a quick sanity check is sketched after this list).
  • Further information on Whisper is available at https://github.com/openai/whisper.
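A small sanity check, not part of the project, can confirm that ffmpeg is on the PATH and that a Whisper model loads after `pip install -r requirements.txt`; the "tiny" model is used here only to keep the download small.

```python
# Verify system and Python dependencies before running the demo.
import shutil

import whisper

if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found on PATH; install it system-wide first.")

model = whisper.load_model("tiny")  # small download, enough to confirm the install works
print("ffmpeg found and Whisper 'tiny' model loaded successfully.")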

Highlighted Details

  • Real-time transcription of speech using OpenAI Whisper.
  • Continuous audio recording and concatenation for live processing.

Maintenance & Community

No information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The code is in the public domain, allowing for unrestricted use, modification, and distribution, including commercial applications.

Limitations & Caveats

The README does not provide performance benchmarks, hardware requirements, or details on supported audio formats and languages; ffmpeg is the only system-level dependency listed. The project appears to be a demonstration with limited explicit support information.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 117 stars in the last 90 days
