whisper_mic  by mallorbc

Microphone interface for OpenAI's Whisper speech-to-text model

created 2 years ago
771 stars

Top 46.2% on sourcepulse

Project Summary

This project provides a Python package for integrating OpenAI's Whisper speech-to-text model with a microphone, enabling real-time transcription and dictation. It's designed for developers and users who need to incorporate voice input into their applications or use Whisper directly from their microphone without complex setup.

How It Works

The package captures audio from a microphone, splits it into chunks, and passes each chunk to the Whisper model for transcription. It offers both a command-line interface for direct use and a Python API for programmatic integration, so callers do not have to handle audio capture or model inference themselves.
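To make the capture-and-chunk step concrete, here is a minimal sketch of how raw microphone bytes could be prepared for a speech model. It is not whisper_mic's implementation: the function names `pcm16_to_float` and `chunk_samples` are hypothetical, and it assumes 16-bit little-endian mono PCM at 16 kHz, the sample rate Whisper operates on.

```python
import array
import sys
from typing import Iterator, List

SAMPLE_RATE = 16_000  # Whisper works on 16 kHz mono audio


def pcm16_to_float(pcm: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # the input is assumed to be little-endian
    return [s / 32768.0 for s in samples]


def chunk_samples(
    samples: List[float], seconds: float, rate: int = SAMPLE_RATE
) -> Iterator[List[float]]:
    """Yield consecutive chunks of at most `seconds` of audio."""
    size = int(seconds * rate)
    if size <= 0:
        raise ValueError("chunk length must cover at least one sample")
    for start in range(0, len(samples), size):
        yield samples[start:start + size]
```

Each yielded chunk would then be handed to the model for transcription; the final chunk may be shorter than the others.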

Quick Start & Requirements

  • Install via pip: pip install whisper-mic
  • Prerequisites: Python 3.x, portaudio19-dev (Linux), pyaudio.
  • GPU acceleration is supported by the underlying Whisper model, but not explicitly managed by this package's installation.
  • Official Docs: https://github.com/mallorbc/whisper_mic
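After installing, a quick way to confirm the Python-level prerequisites are present is to check whether the modules can be found. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper, and the module names `pyaudio` and `whisper_mic` are the import names of the PyAudio and whisper-mic packages. It cannot detect a missing system library such as portaudio19-dev.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report_prerequisites() -> None:
    """Print which of the expected modules are not installed."""
    missing = missing_modules(["pyaudio", "whisper_mic"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All Python prerequisites found.")
```

Calling `report_prerequisites()` in the target environment lists anything that still needs to be installed.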

Highlighted Details

  • Supports five Whisper model sizes (tiny, base, small, medium, large); English-only variants exist for every size except large.
  • Offers a command-line interface for live dictation to the active cursor (--loop --dictate).
  • Provides a Python API (WhisperMic().listen()) for easy integration into other projects.
  • Includes video tutorials for setup and usage.
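The Python API mentioned above can be wrapped in a small helper. This is a sketch under assumptions: `WhisperMic().listen()` is taken from the listing, it is assumed to block until a phrase is heard and to return the transcript as a string, and `collect_utterances` and `make_default_mic` are hypothetical names introduced for illustration.

```python
from typing import List, Protocol


class Listener(Protocol):
    """Any object with a blocking listen() that returns transcribed text."""

    def listen(self) -> str: ...


def collect_utterances(mic: Listener, count: int) -> List[str]:
    """Call mic.listen() `count` times and keep the non-empty transcripts."""
    transcripts = []
    for _ in range(count):
        text = mic.listen()  # assumed to return a string
        if text and text.strip():
            transcripts.append(text.strip())
    return transcripts


def make_default_mic() -> Listener:
    """Build a WhisperMic; needs the whisper-mic package and a microphone."""
    # Imported lazily so this module loads even when the package is absent.
    from whisper_mic import WhisperMic

    return WhisperMic()
```

With the package installed and a microphone attached, `collect_utterances(make_default_mic(), 3)` would return up to three transcribed phrases. Accepting any object with a `listen()` method keeps the helper easy to exercise without audio hardware.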

Maintenance & Community

The project appears to be a personal initiative. Little information is available on community size or on active development beyond the initial release. Paid professional assistance is offered via email.

Licensing & Compatibility

  • Both the code in this repository and the Whisper model weights are released under the MIT License.
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

The project relies on the underlying Whisper model's capabilities and limitations. Setup may require system-level audio development libraries. The project's community support and long-term maintenance status are not clearly indicated.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 10 stars in the last 90 days
