whisper_mic by mallorbc

Microphone interface for OpenAI's Whisper speech-to-text model

Created 3 years ago

783 stars

Top 44.9% on SourcePulse

Project Summary

This project provides a Python package for integrating OpenAI's Whisper speech-to-text model with a microphone, enabling real-time transcription and dictation. It's designed for developers and users who need to incorporate voice input into their applications or use Whisper directly from their microphone without complex setup.

How It Works

The package leverages the Whisper model architecture to process audio input from a microphone. It handles audio capture, chunking, and feeding into the Whisper model for transcription. The project offers both a command-line interface for direct use and a Python API for programmatic integration, abstracting away the complexities of audio handling and model inference.
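The capture, chunk, and transcribe flow described above can be sketched in a few lines. This is a minimal illustration, not the package's actual implementation: fixed-length chunking, the function names `chunk_samples` and `transcribe_stream`, and the `transcribe` callback are all assumptions made for the example. The only outside fact used is that Whisper models expect 16 kHz audio.

```python
from typing import Callable, Iterable, Iterator, List

SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio


def chunk_samples(
    samples: Iterable[float],
    chunk_seconds: float,
    sample_rate: int = SAMPLE_RATE,
) -> Iterator[List[float]]:
    """Yield fixed-length chunks of samples; the final chunk may be shorter."""
    chunk_size = int(chunk_seconds * sample_rate)
    if chunk_size <= 0:
        raise ValueError("chunk_seconds is too small for this sample rate")
    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == chunk_size:
            yield buffer
            buffer = []
    if buffer:  # flush whatever is left at the end of the stream
        yield buffer


def transcribe_stream(
    samples: Iterable[float],
    transcribe: Callable[[List[float]], str],  # hypothetical model wrapper
    chunk_seconds: float = 2.0,
    sample_rate: int = SAMPLE_RATE,
) -> List[str]:
    """Feed each chunk to the supplied transcriber and collect the text."""
    return [
        transcribe(chunk)
        for chunk in chunk_samples(samples, chunk_seconds, sample_rate)
    ]
```

In a real setup, `samples` would come from the microphone and `transcribe` would wrap a loaded Whisper model. Passing the transcriber in as a callback keeps the chunking logic usable without audio hardware.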

Quick Start & Requirements

Install via pip: pip install whisper-mic
Prerequisites: Python 3.x, PyAudio, and on Linux the portaudio19-dev system package (the PortAudio headers PyAudio builds against).
GPU acceleration is supported by the underlying Whisper model, but not explicitly managed by this package's installation.
Official Docs: https://github.com/mallorbc/whisper_mic
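After installing, a quick way to confirm the Python-level prerequisites are importable is sketched below. The helper name `missing_modules` is hypothetical. The module names checked (`pyaudio`, `whisper`, `whisper_mic`) are the usual import names for PyAudio, openai-whisper, and this package. The check does not cover system libraries such as PortAudio.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["pyaudio", "whisper", "whisper_mic"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All Python prerequisites are importable.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays fast and does not load any model.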

Highlighted Details

Supports five Whisper model sizes (tiny, base, small, medium, large); English-only variants exist for every size except large.
Offers a command-line interface for live dictation to the active cursor (--loop --dictate).
Provides a Python API (WhisperMic().listen()) for easy integration into other projects.
Includes video tutorials for setup and usage.
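The Python API mentioned above can be exercised with a small wrapper. This is a hedged sketch. `WhisperMic().listen()` is the call named in the summary, and the import path `from whisper_mic import WhisperMic` is assumed from the package name. The `listen_once` function and its `mic_factory` parameter are hypothetical, added so the code can be tried without a microphone.

```python
from typing import Any, Callable, Optional


def listen_once(mic_factory: Optional[Callable[[], Any]] = None) -> str:
    """Create a microphone object, capture one utterance, return its text.

    mic_factory is an illustrative hook: pass any zero-argument callable
    that returns an object with a listen() method. When omitted, the real
    WhisperMic class is imported and used.
    """
    if mic_factory is None:
        # Imported lazily so this module loads even if whisper-mic
        # is not installed (assumed import path).
        from whisper_mic import WhisperMic
        mic_factory = WhisperMic
    mic = mic_factory()
    return mic.listen()


class FakeMic:
    """Stand-in used for testing without audio hardware."""

    def listen(self) -> str:
        return "hello world"
```

Calling `listen_once()` with no arguments uses the real package, while `listen_once(FakeMic)` returns the canned string and is handy in unit tests.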

Maintenance & Community

The project appears to be a personal initiative. The repository gives little information about community size or about development activity beyond the initial release. Paid professional assistance is offered via email.

Licensing & Compatibility

Both the code in this repository and the Whisper model weights are released under the MIT License.
Compatible with commercial and closed-source applications.

Limitations & Caveats

Transcription quality and speed depend on the underlying Whisper model, so its limitations carry over. Setup may require system-level audio development libraries, such as portaudio19-dev on Linux. The project's community support and long-term maintenance status are not clearly indicated.

Explore Similar Projects

susi_translator by susiai

LiveWhisper by Nikorasu

ollama-voice-mac by apeatling

Speech-Translate by Dadangdut33

speech-to-text by reriiasu

whisper-writer by savbell

voice-assistant by linyiLYi

mini-omni by gpt-omni

whisper_real_time by davabase

WhisperLive by collabora

RealtimeSTT by KoljaB

piper by rhasspy