Python bindings for whisper.cpp, enabling local speech transcription
This project provides Python bindings for whisper.cpp, enabling efficient, on-device speech-to-text transcription. It targets developers and researchers needing to integrate robust ASR capabilities into Python applications, offering a simpler API over the core C++ library and supporting various hardware acceleration backends.
How It Works
The library wraps the whisper.cpp C++ library, exposing its functionality through a Pythonic interface. It leverages GGML for efficient tensor operations and model execution, allowing for CPU and GPU (CUDA, CoreML, Vulkan) acceleration. Users can load whisper models, transcribe audio files, and access various parameters for fine-tuning the transcription process.
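The load-and-transcribe flow described above might look roughly like the sketch below. It assumes the `Model` class from `pywhispercpp.model`, and that each returned segment exposes `t0`, `t1`, and `text`, with times expressed in 10 ms ticks as in whisper.cpp. The helpers `format_timestamp`, `format_segment`, and `transcribe_file` are hypothetical names introduced only for illustration, and the `base.en` model name is an example choice.

```python
from types import SimpleNamespace


def format_timestamp(ticks: int) -> str:
    """Convert a time in 10 ms ticks (assumed unit) to HH:MM:SS.mmm."""
    total_ms = ticks * 10
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def format_segment(segment) -> str:
    """Render one segment as '[start -> end] text'."""
    start = format_timestamp(segment.t0)
    end = format_timestamp(segment.t1)
    return f"[{start} -> {end}] {segment.text.strip()}"


def transcribe_file(path: str, model_name: str = "base.en") -> list[str]:
    """Transcribe an audio file and return one formatted line per segment."""
    # Imported lazily so the helpers above work without the package installed.
    from pywhispercpp.model import Model

    model = Model(model_name)          # loads (and may download) the model
    segments = model.transcribe(path)  # list of segment objects
    return [format_segment(s) for s in segments]


if __name__ == "__main__":
    # Stand-in segment so the formatting can be tried without a model.
    demo = SimpleNamespace(t0=0, t1=250, text=" Hello world. ")
    print(format_segment(demo))
```

Calling `transcribe_file("audio.wav")` would then return the formatted lines for a real recording; the file name here is only a placeholder.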
Quick Start & Requirements
Install from PyPI: pip install pywhispercpp
Install the latest version from GitHub: pip install git+https://github.com/absadiki/pywhispercpp
ffmpeg is required for non-WAV audio files.
Highlighted Details
A command-line tool (pwcpp) for direct audio file transcription.
An Assistant example demonstrating real-time transcription with Voice Activity Detection (VAD).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The print_realtime option is noted as potentially problematic; the project recommends using callbacks instead. The OpenVINO notes mention using the Ubuntu 22 toolkit on Ubuntu 24, which suggests there may be version-specific issues.
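As a rough illustration of the callback approach, the sketch below collects segments as they are produced instead of relying on print_realtime. It assumes `Model.transcribe` accepts a `new_segment_callback` argument that is called with one segment object (carrying a `text` attribute) at a time. `SegmentCollector` and `transcribe_with_callback` are hypothetical names, not part of the library.

```python
from types import SimpleNamespace


class SegmentCollector:
    """Callable that stores (and optionally prints) each new segment's text."""

    def __init__(self, echo: bool = True):
        self.texts = []
        self.echo = echo

    def __call__(self, segment) -> None:
        text = segment.text.strip()
        self.texts.append(text)
        if self.echo:
            print(text, flush=True)  # show progress as segments arrive

    @property
    def transcript(self) -> str:
        """All collected segment texts joined into one string."""
        return " ".join(self.texts)


def transcribe_with_callback(path: str, model_name: str = "base.en") -> str:
    """Transcribe a file, streaming segments through a SegmentCollector."""
    # Imported lazily so SegmentCollector can be used without the package.
    from pywhispercpp.model import Model

    collector = SegmentCollector()
    model = Model(model_name)
    # Assumption: the callback is invoked once per newly decoded segment.
    model.transcribe(path, new_segment_callback=collector)
    return collector.transcript


if __name__ == "__main__":
    # Simulate two incoming segments without loading a model.
    collector = SegmentCollector()
    for raw in (" Hello", "world. "):
        collector(SimpleNamespace(text=raw))
    print(collector.transcript)
```

Keeping the state in a small callable object, rather than a bare function, makes it easy to read the accumulated transcript after transcription finishes.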