openai-whisper-cpu by MiscellaneousStuff

Accelerating speech recognition on consumer CPUs

Created 3 years ago

259 stars

Top 97.8% on SourcePulse

Project Summary

This project offers experimental modifications to OpenAI's Whisper ASR model, applying dynamic quantization to enhance inference speed and throughput on CPU-only hardware. It targets users with consumer-grade laptops or desktops lacking dedicated GPUs, enabling faster transcription by making larger Whisper models more efficient.

How It Works

The project replaces Whisper's custom Linear() layers with the standard torch.nn.Linear() and then applies dynamic quantization (torch.qint8) via torch.quantization.quantize_dynamic. The weights of the linear layers are stored as 8-bit integers instead of 32-bit floats, which reduces the computational load and memory bandwidth of CPU inference.
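The arithmetic behind dynamic quantization of a single linear layer can be sketched in plain Python. This is a simplified illustration, not the project's code and not PyTorch's implementation: it assumes one symmetric scale per tensor, and the helper names (quantize, linear_fp32, linear_dynamic_int8) and the sample numbers are hypothetical.

```python
# Simplified sketch of dynamic int8 quantization for one linear layer.
# Assumption: symmetric per-tensor scaling; real PyTorch kernels differ in detail.

def quantize(values, num_bits=8):
    """Map floats to signed integers that share a single scale factor."""
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for int8
    max_abs = max(abs(v) for v in values) or 1.0   # avoid a zero scale
    scale = max_abs / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def linear_fp32(weights, x):
    """Reference float matrix-vector product (one output per weight row)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def linear_dynamic_int8(q_weights, w_scale, x):
    """Quantize the activations at call time, accumulate in integers, rescale."""
    q_x, x_scale = quantize(x)
    return [
        sum(qw * qx for qw, qx in zip(row, q_x)) * w_scale * x_scale
        for row in q_weights
    ]


# Hypothetical 2x3 weight matrix and input vector.
weights = [[0.12, -0.53, 0.91], [0.40, 0.07, -0.88]]
x = [1.5, -2.0, 0.25]

# Weights are quantized once, ahead of time.
n_cols = len(weights[0])
q_flat, w_scale = quantize([w for row in weights for w in row])
q_weights = [q_flat[i:i + n_cols] for i in range(0, len(q_flat), n_cols)]

exact = linear_fp32(weights, x)
approx = linear_dynamic_int8(q_weights, w_scale, x)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print(exact, approx, max_error)
```

The weights are converted once, while the input scale is recomputed on every call, which is what makes the scheme "dynamic". The integer result only approximates the float one, so transcription accuracy should be checked after quantizing.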

Quick Start & Requirements

Install and run:
- Initialize and update submodules: git submodule init && git submodule update
- Install: pip install -e ./whisper
Prerequisites: Python and PyTorch. Designed for CPU inference.
Setup time and resource footprint: not specified.
Quick-start, docs, and demo links: none provided.

Highlighted Details

Quantization speeds up CPU inference for the larger Whisper models relative to their non-quantized CPU versions: 1.62x (base), 2.76x (small), 2.62x (medium).
The quantized tiny model is slower than the original CPU fp32 model, running at 0.74x its speed, so the benefit depends on model size.
Transcription runs faster than real time at every listed size: 9.67x (tiny), 9.37x (base), 4.34x (small), 1.29x (medium).
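The page does not say how these ratios were measured. A minimal sketch, assuming the usual definitions (speedup as baseline time divided by quantized time, and real-time factor as audio duration divided by processing time), shows how such figures are typically derived. The function names and the timings below are hypothetical and are not the project's measurements.

```python
def speedup(baseline_seconds, quantized_seconds):
    """Ratio above 1.0 means the quantized model is faster; below 1.0, slower."""
    if baseline_seconds <= 0 or quantized_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / quantized_seconds


def real_time_factor(audio_seconds, processing_seconds):
    """Ratio above 1.0 means transcription finishes faster than the audio plays."""
    if audio_seconds <= 0 or processing_seconds <= 0:
        raise ValueError("durations must be positive")
    return audio_seconds / processing_seconds


# Hypothetical timings for a 60-second clip.
audio_seconds = 60.0
baseline_seconds = 12.0
quantized_seconds = 6.0

clip_speedup = speedup(baseline_seconds, quantized_seconds)
clip_rtf = real_time_factor(audio_seconds, quantized_seconds)
print(f"speedup: {clip_speedup:.2f}x, real-time factor: {clip_rtf:.2f}x")
```

Under these definitions, a speedup below 1.0, such as the 0.74x reported for tiny, means the quantized model took longer than the original.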

Maintenance & Community

No details on maintainers, community channels, or roadmap are available.

Licensing & Compatibility

License is unspecified, posing a potential blocker for commercial use or integration.

Limitations & Caveats

The project is experimental. Quantization can slow down smaller models (e.g., tiny is slower than the original CPU fp32 model). The unspecified license is a significant adoption blocker.
