Demo for real-time speech-to-text using OpenAI's Whisper
This project provides a real-time transcription system using OpenAI's Whisper model, targeting developers and researchers needing live speech-to-text capabilities. It enables continuous audio processing for immediate transcription output.
How It Works
The system operates by continuously recording audio in a separate thread. It concatenates raw audio bytes from multiple recordings to maintain a continuous stream for the Whisper model. This approach allows for near-instantaneous transcription of spoken words as they are detected.
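The recording-and-concatenation flow described above can be sketched roughly as follows. This is an illustrative stand-in, not the project's actual code: the function and variable names are hypothetical, and simulated byte chunks replace real microphone input.

```python
import queue
import threading

def record_chunks(audio_queue, chunks):
    """Stand-in for the background recording thread: pushes raw
    PCM byte chunks onto a queue as they are captured."""
    for chunk in chunks:
        audio_queue.put(chunk)
    audio_queue.put(None)  # sentinel: recording finished

def collect_audio(audio_queue):
    """Main thread: concatenate raw audio bytes from successive
    recordings into one continuous buffer for the Whisper model."""
    buffer = b""
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            break
        buffer += chunk  # concatenation keeps the stream continuous
    return buffer

# Simulated recordings (the real system reads these from the microphone)
fake_chunks = [b"\x00\x01" * 4, b"\x02\x03" * 4]
q = queue.Queue()
t = threading.Thread(target=record_chunks, args=(q, fake_chunks))
t.start()
audio = collect_audio(q)
t.join()
print(len(audio))  # total bytes accumulated across both recordings
```

In the real system the accumulated buffer would then be converted to the sample format Whisper expects and passed to the model for transcription.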
Quick Start & Requirements
Install the Python dependencies:
pip install -r requirements.txt
The project also requires ffmpeg to be installed system-wide. Installation instructions are provided for Ubuntu/Debian, Arch Linux, macOS, and Windows.
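For reference, ffmpeg is typically installed via the platform package managers on the systems the README covers. These commands are common defaults and may differ from the project's own instructions:

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg
# Arch Linux
sudo pacman -S ffmpeg
# macOS (Homebrew)
brew install ffmpeg
# Windows (Chocolatey)
choco install ffmpeg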
Maintenance & Community
No information on maintainers, community channels, or roadmap is provided in the README.
Licensing & Compatibility
The code is in the public domain, allowing for unrestricted use, modification, and distribution, including commercial applications.
Limitations & Caveats
The README does not detail performance benchmarks, hardware requirements beyond ffmpeg, or limitations regarding supported audio formats or languages. The project appears to be a demonstration with limited explicit support information.