faster-whisper by SYSTRAN

Faster Whisper reimplementation using CTranslate2

Created 2 years ago
18,160 stars

Top 2.5% on SourcePulse

Project Summary

This project provides a significantly faster and more memory-efficient implementation of OpenAI's Whisper speech-to-text model, leveraging the CTranslate2 inference engine. It targets developers and researchers needing high-throughput transcription, offering up to 4x speed improvements and reduced resource consumption, especially with 8-bit quantization.

How It Works

Faster-Whisper reimplements the Whisper architecture using CTranslate2, a specialized C++ inference engine optimized for Transformer models. This allows for efficient execution on both CPU and GPU, with particular benefits from 8-bit quantization, which drastically reduces memory usage and speeds up computation without significant accuracy loss.

Quick Start & Requirements

  • Install: pip install faster-whisper
  • Prerequisites:
    • Python 3.9+
    • GPU: CUDA 12, cuBLAS, and cuDNN 9 (older CUDA/cuDNN versions require a matching older CTranslate2 release). The CUDA libraries can be installed via Docker or, on Linux, via pip.
    • CPU: no special hardware requirements.
  • Setup: Minimal setup for basic Python usage. GPU setup requires NVIDIA driver and library installation.
  • Docs: https://github.com/SYSTRAN/faster-whisper
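For the GPU path on Linux, the README's pip-based recipe installs the NVIDIA libraries alongside the package and points the dynamic loader at them; a sketch (Docker is the alternative):

```shell
pip install faster-whisper

# GPU only (Linux): install cuBLAS and cuDNN 9 via pip, then expose them
# to the dynamic loader before starting Python.
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12
export LD_LIBRARY_PATH=$(python3 -c 'import os, nvidia.cublas.lib, nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))')
```

The export only affects the current shell; persistent setups typically add it to the environment of whatever service runs the transcription.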

Highlighted Details

  • Up to 4x faster than openai/whisper on GPU (FP16) and significantly faster on CPU (INT8).
  • Supports 8-bit quantization on CPU and GPU for reduced memory footprint (e.g., 2926MB VRAM for Large-v2 INT8 vs. 4708MB FP16).
  • Batch transcription support for increased throughput.
  • Integrates Silero VAD for optional silence filtering.
  • Provides word-level timestamps.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

  • GPU execution requires CUDA 12 and cuDNN 9; older CUDA/cuDNN versions can only be used by downgrading to a compatible older ctranslate2 release.
  • CPU benchmarks were measured on one specific machine (Intel Core i7-12700K); performance will vary on other hardware.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 4
  • Issues (30d): 11
  • Star history: 531 stars in the last 30 days
