whisper_real_time  by davabase

Demo for real-time speech-to-text using OpenAI's Whisper

Created 2 years ago
2,846 stars

Top 16.8% on SourcePulse

Project Summary

This project provides a real-time transcription system using OpenAI's Whisper model, targeting developers and researchers needing live speech-to-text capabilities. It enables continuous audio processing for immediate transcription output.

How It Works

The system records audio continuously in a separate thread. It concatenates the raw audio bytes from successive recordings so that the Whisper model receives one continuous stream rather than isolated fragments. The transcription can therefore be updated with low latency as new speech is detected.
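The recording-and-concatenation pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`record_chunks`, `drain`, `pcm16_to_float`, `fake_microphone`) are hypothetical, the microphone is replaced by a generator of 16-bit PCM chunks, and the call into Whisper is left out entirely.

```python
import queue
import threading
from array import array


def record_chunks(source, out_queue, stop_event):
    """Producer thread: push raw 16-bit PCM chunks from `source` onto a queue.

    `source` is any iterable of bytes objects; a real recorder would yield
    chunks from a microphone instead (assumption for this sketch).
    """
    for chunk in source:
        if stop_event.is_set():
            break
        out_queue.put(chunk)


def drain(out_queue):
    """Concatenate every chunk currently waiting in the queue into one bytes object."""
    chunks = []
    while True:
        try:
            chunks.append(out_queue.get_nowait())
        except queue.Empty:
            break
    return b"".join(chunks)


def pcm16_to_float(raw):
    """Convert native-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array("h")
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]


def fake_microphone(n_chunks=3, samples_per_chunk=4):
    """Stand-in for a microphone: yields a few small PCM chunks (hypothetical)."""
    for i in range(n_chunks):
        yield array("h", [i * 1000] * samples_per_chunk).tobytes()


if __name__ == "__main__":
    chunk_queue = queue.Queue()
    stop = threading.Event()
    recorder = threading.Thread(
        target=record_chunks,
        args=(fake_microphone(), chunk_queue, stop),
        daemon=True,
    )
    recorder.start()
    recorder.join()  # the fake source is finite; a live recorder would keep running

    phrase_bytes = b""
    phrase_bytes += drain(chunk_queue)  # grow the buffer with newly recorded audio
    samples = pcm16_to_float(phrase_bytes)
    print(len(phrase_bytes), "bytes buffered,", len(samples), "samples ready for a model")
```

In a live setup the main loop would call `drain` repeatedly, append the result to the running buffer, and pass the converted samples to the speech model each time new audio arrives.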

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires ffmpeg to be installed system-wide. Installation instructions are provided for Ubuntu/Debian, Arch Linux, macOS, and Windows.
  • Further information on Whisper is available at https://github.com/openai/whisper.
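
Because ffmpeg must be installed system-wide, it can be useful to confirm it is on the PATH before running the demo. The snippet below is a small sketch using only the Python standard library; the helper name `find_ffmpeg` is hypothetical and not part of the project.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; install it before running the demo.")
    else:
        print("ffmpeg found at", path)
```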

Highlighted Details

  • Real-time transcription of speech using OpenAI Whisper.
  • Continuous audio recording and concatenation for live processing.

Maintenance & Community

No information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The code is in the public domain, allowing for unrestricted use, modification, and distribution, including commercial applications.

Limitations & Caveats

The README does not provide performance benchmarks, hardware requirements, or software requirements beyond ffmpeg and the listed Python dependencies, and it does not state which audio formats or languages are supported. The project appears to be a demonstration with little explicit support information.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 25 stars in the last 30 days
