whisper-ctranslate2 by Softcatala

CLI tool for faster Whisper transcription/translation

Created 2 years ago

1,183 stars

Top 32.8% on SourcePulse

Project Summary

This project provides a command-line client for the Whisper speech-to-text model that runs inference through CTranslate2. It targets users who need faster, more memory-efficient transcription and translation than the original OpenAI Whisper implementation provides. Its command line is designed to be compatible with the original OpenAI client, which eases migration.

How It Works

The client runs Whisper on CTranslate2, a fast inference engine for Transformer models. This yields speedups of up to 4x and lower memory usage through optimized kernels and reduced-precision computation (INT8 quantization, FP16). It also supports batched inference for further performance gains and integrates a Voice Activity Detection (VAD) filter that skips portions of the audio without speech.
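To make the memory saving from INT8 quantization concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It illustrates the general technique only and is not CTranslate2's actual implementation. The function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: w is approximated by q * scale."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the INT8 codes."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -1.27, 0.003, 0.5, -0.31]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of the four bytes of an FP32 value, at the cost of a rounding error of at most half the scale per weight.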

Quick Start & Requirements

Install: pip install -U whisper-ctranslate2
Docker: docker pull ghcr.io/softcatala/whisper-ctranslate2:latest
GPU support requires NVIDIA cuBLAS 11.x and cuDNN 8.x.
CPU support includes x86-64 and ARM64 with various backends.
Documentation: https://github.com/Softcatala/whisper-ctranslate2

Highlighted Details

Up to 4x faster than OpenAI Whisper, with lower memory usage.
Supports transcription and translation (to English).
Options for batched inference, quantization (--compute_type), VAD filtering, and live microphone transcription.
Experimental diarization support via pyannote.audio, which requires a Hugging Face token and acceptance of the terms of the models it uses.
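The options listed above can be combined in a single invocation. The sketch below assembles such a command line from Python. The helper `build_command` and the file name `interview.mp3` are hypothetical. Only `--compute_type` is named in this summary; the other flag spellings (`--model`, `--task`, `--vad_filter`, `--batched`) are assumptions based on the project's README and should be checked against `whisper-ctranslate2 --help`.

```python
import shlex


def build_command(audio_path, model="medium", task="transcribe",
                  compute_type=None, vad_filter=False, batched=False):
    """Build an argument list for whisper-ctranslate2 (flag names are assumptions)."""
    cmd = ["whisper-ctranslate2", audio_path, "--model", model, "--task", task]
    if compute_type is not None:
        # e.g. "int8" or "float16"
        cmd += ["--compute_type", compute_type]
    if vad_filter:
        cmd += ["--vad_filter", "True"]
    if batched:
        cmd += ["--batched", "True"]
    return cmd


# Translate to English with INT8 quantization and the VAD filter enabled.
cmd = build_command("interview.mp3", task="translate",
                    compute_type="int8", vad_filter=True)
print(shlex.join(cmd))

# To actually run it (requires the tool to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.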

Maintenance & Community

Project contact: Jordi Mas (jmas@softcatala.org).
Related project: Open dubbing (https://github.com/jmas/open-dubbing).

Licensing & Compatibility

License: MIT.
Compatible with commercial use.

Limitations & Caveats

Translation is currently limited to English as the target language. Experimental diarization requires manual setup and acceptance of third-party model terms.

Explore Similar Projects

insanely-fast-whisper-cli by ochen1

Auralis by astramind-ai

LiveWhisper by Nikorasu

WhisperS2T by shashikg

awesome-whisper by sindresorhus

transcribe-anything by zackees

whisper-standalone-win by Purfview

faster-whisper-GUI by CheshireCC

whisper_mic by mallorbc

stable-ts by jianfch

WhisperSpeech by WhisperSpeech

faster-whisper by SYSTRAN