pywhispercpp by absadiki

Python bindings for whisper.cpp, enabling local speech transcription

Created 2 years ago

313 stars

Top 86.3% on SourcePulse

1 Expert Loves This Project

Georgi Gerganov

Author of llama.cpp, whisper.cpp

Project Summary

This project provides Python bindings for whisper.cpp, enabling efficient, on-device speech-to-text transcription. It targets developers and researchers who need to integrate automatic speech recognition (ASR) into Python applications, offering a simpler API than the core C++ library and support for several hardware acceleration backends.

How It Works

The library wraps whisper.cpp and exposes its functionality through a Pythonic interface. Tensor operations and model execution are handled by GGML, which runs on the CPU and can be accelerated through backends such as CUDA, CoreML, or Vulkan. Users can load Whisper models, transcribe audio files, and adjust the parameters that control transcription.
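The load-and-transcribe flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference usage: it assumes the `pywhispercpp.model.Model` class, a `transcribe` method returning segment objects with `t0`, `t1`, and `text` attributes, and timestamps expressed in 10 ms units as in whisper.cpp. The helper names (`format_timestamp`, `format_segments`, `transcribe_file`) and the `base.en` model name are chosen for this example.

```python
from typing import Iterable, List


def format_timestamp(ticks: int) -> str:
    """Render a time given in 10 ms units (assumed) as HH:MM:SS.mmm."""
    total_ms = ticks * 10
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def format_segments(segments: Iterable) -> List[str]:
    """Turn segment objects (t0, t1, text) into timestamped lines."""
    return [
        f"[{format_timestamp(s.t0)} --> {format_timestamp(s.t1)}] {s.text.strip()}"
        for s in segments
    ]


def transcribe_file(path: str, model_name: str = "base.en") -> List[str]:
    """Load a model and transcribe one file (hypothetical wrapper)."""
    # Imported lazily so the helpers above work without the package installed.
    from pywhispercpp.model import Model

    model = Model(model_name)
    return format_segments(model.transcribe(path))
```

Calling `transcribe_file("talk.wav")` would then return one formatted line per segment, provided the package is installed and the model can be loaded.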

Quick Start & Requirements

Install CPU version: pip install pywhispercpp
Install from source for best performance: pip install git+https://github.com/absadiki/pywhispercpp
Hardware acceleration requires the corresponding toolkit (CUDA, CoreML, or Vulkan) to be installed.
ffmpeg is required for non-WAV audio files.
Official documentation: https://github.com/absadiki/pywhispercpp
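Since non-WAV input depends on ffmpeg, a small preflight check can catch a missing binary before transcription starts. The sketch below also shows an equivalent manual conversion to 16 kHz mono 16-bit PCM, the input format whisper.cpp works with. The function names are hypothetical, and the library may well perform this conversion itself; this is only an illustration of the requirement.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def needs_ffmpeg(path: str) -> bool:
    """Anything that is not a .wav file is assumed to need ffmpeg."""
    return Path(path).suffix.lower() != ".wav"


def build_ffmpeg_command(src: str, dst: str) -> List[str]:
    """ffmpeg arguments for 16 kHz, mono, 16-bit PCM WAV output."""
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def to_wav(src: str) -> str:
    """Convert src to a sibling WAV file and return the new path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # Write next to the source under a distinct name so the input is never overwritten.
    dst = str(Path(src).with_name(Path(src).stem + "_16k.wav"))
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```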

Highlighted Details

Supports multiple hardware acceleration backends: CUDA, CoreML, Vulkan, OpenBLAS, and OpenVINO.
Provides a Command Line Interface (CLI) tool (pwcpp) for direct audio file transcription.
Includes an Assistant example demonstrating real-time transcription with Voice Activity Detection (VAD).
Allows direct access to underlying C-APIs for advanced usage.

Maintenance & Community

Contributions are welcomed via issues and discussions on GitHub.
Project discussions can be found at: https://github.com/absadiki/pywhispercpp/discussions

Licensing & Compatibility

MIT License, consistent with whisper.cpp.
Permissive license suitable for commercial and closed-source applications.

Limitations & Caveats

The print_realtime option is noted as potentially problematic; callbacks are recommended instead. For OpenVINO, the notes report that the Ubuntu 22 toolkit works on Ubuntu 24, which suggests that support may be sensitive to toolkit and OS versions.
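A callback-based alternative to print_realtime might look like the sketch below. It assumes that `Model.transcribe` accepts a `new_segment_callback` argument which is invoked with each new segment object, and that `print_realtime` and `print_progress` can be passed to the `Model` constructor; these should be checked against the installed version. `make_collector` and `transcribe_with_callback` are names invented for this example.

```python
from typing import Callable, List


def make_collector(sink: List[str]) -> Callable:
    """Return a callback that appends each segment's text to sink."""
    def on_new_segment(segment) -> None:
        # segment is assumed to expose a .text attribute.
        sink.append(segment.text.strip())
    return on_new_segment


def transcribe_with_callback(path: str, model_name: str = "base.en") -> List[str]:
    """Collect segment text as it is produced (hypothetical wrapper)."""
    # Imported lazily so make_collector can be used without the package installed.
    from pywhispercpp.model import Model

    lines: List[str] = []
    # Assumed constructor parameters; built-in realtime printing is turned off.
    model = Model(model_name, print_realtime=False, print_progress=False)
    model.transcribe(path, new_segment_callback=make_collector(lines))
    return lines
```

Routing segments through a callback keeps output handling in application code, so the same hook can feed a log, a UI, or a queue instead of standard output.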

Health Check

Last Commit: 1 week ago
Responsiveness: 1 day

Star History

9 stars in the last 30 days