Lightning-SimulWhisper by altalt-org

Accelerated local speech transcription for Apple Silicon

Created 3 months ago
364 stars

Top 77.3% on SourcePulse

Project Summary

Lightning-SimulWhisper delivers high-performance, real-time, local speech-to-text transcription optimized for Apple Silicon devices. By integrating MLX and CoreML for efficient on-device processing, it yields substantial speed gains and better power efficiency than standard PyTorch pipelines, serving users who need fast, responsive transcription without cloud dependencies.

How It Works

The project employs a hybrid architecture for Whisper transcription on Apple Silicon. CoreML accelerates encoder inference on the Apple Neural Engine, achieving up to 18x speedups with reduced power draw. MLX drives the decoder, offering up to 15x speed improvements over PyTorch and supporting the AlignAtt policy for simultaneous decoding as well as configurable beam search. This dual-framework approach optimizes both transcription speed and system efficiency.
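As a rough illustration of this encoder/decoder split, here is a minimal sketch using NumPy stand-ins: `encode` plays the role of the CoreML/ANE encoder and `decode_step` the MLX decoder loop. All function names, shapes, and the toy greedy decoding here are illustrative assumptions, not the project's actual API.

```python
import numpy as np

def encode(mel: np.ndarray) -> np.ndarray:
    """Stand-in for the CoreML/ANE encoder: mel frames -> hidden states.
    In the real project this pass runs once per audio chunk on the ANE."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal((mel.shape[1], 8))
    return np.tanh(mel @ w)  # (frames, 8)

def decode_step(enc: np.ndarray, prev_tokens: list[int]) -> int:
    """Stand-in for one MLX decoder step: greedy pick over a toy vocab
    (seeded by step index so successive tokens can differ)."""
    rng = np.random.default_rng(len(prev_tokens) + 1)
    vocab_proj = rng.standard_normal((enc.shape[1], 16))
    logits = enc.mean(axis=0) @ vocab_proj  # pool encoder states, project
    return int(np.argmax(logits))

mel = np.random.default_rng(2).standard_normal((100, 80))  # 100 frames, 80 mel bins
hidden = encode(mel)           # encoder pass (CoreML in the real project)
tokens: list[int] = []
for _ in range(5):             # autoregressive loop (MLX in the real project)
    tokens.append(decode_step(hidden, tokens))
print(len(tokens))
```

The point of the split is that the encoder is a single large batched pass (a good fit for the ANE), while the decoder is a latency-sensitive sequential loop (a good fit for MLX's lazy, unified-memory execution).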

Quick Start & Requirements

  • Installation: pip install -r requirements.txt for the base setup; add pip install coremltools ane_transformers for CoreML acceleration.
  • Prerequisites: Apple Silicon hardware is mandatory. CoreML acceleration requires cloning whisper.cpp and generating CoreML encoder models via ./scripts/generate_coreml_encoder.sh <model_name>.
  • Resource Footprint: The README claims significant speedups, particularly with CoreML, implying a low per-transcription compute and power cost; no formal benchmarks are published.
  • Documentation: Usage examples and detailed command-line arguments are available in the README.
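Taken together, the setup steps above amount to the following. The repository URL and the example model name are assumptions; the pip commands and the generation script path come from the README.

```shell
# Base install (repository URL assumed from the project name)
git clone https://github.com/altalt-org/Lightning-SimulWhisper
cd Lightning-SimulWhisper
pip install -r requirements.txt

# Optional: CoreML acceleration via the Apple Neural Engine
pip install coremltools ane_transformers
git clone https://github.com/ggerganov/whisper.cpp

# Generate the CoreML encoder for a chosen Whisper model
# (model name is an example; tiny through large-v3-turbo are supported)
./scripts/generate_coreml_encoder.sh base
```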

Highlighted Details

  • Achieves up to 18x encoder and 15x decoder speed increases compared to PyTorch.
  • CoreML acceleration provides notably lower power consumption than MLX-only configurations.
  • Supports a comprehensive range of Whisper models, from tiny to large-v3-turbo.
  • Features the AlignAtt policy for advanced simultaneous streaming speech recognition.
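The AlignAtt policy decides when a streaming decoder may commit a token by inspecting its cross-attention: if the token's strongest attention falls too close to the end of the audio received so far, the decoder waits for more input instead of emitting. Below is a minimal sketch of that rule; the function name, margin value, and toy attention vectors are illustrative assumptions, not the project's API.

```python
import numpy as np

def alignatt_emit(attn: np.ndarray, n_frames: int, frame_margin: int = 4) -> bool:
    """AlignAtt-style stopping rule (sketch): emit the candidate token only
    if the encoder frame it attends to most strongly lies at least
    `frame_margin` frames before the end of the received audio."""
    peak_frame = int(np.argmax(attn))
    return peak_frame < n_frames - frame_margin

# Toy cross-attention distributions over 10 received frames.
attn_early = np.array([0.0, 0.9, 0.1] + [0.0] * 7)  # peaks at frame 1
attn_late = np.array([0.0] * 9 + [1.0])             # peaks at the last frame
print(alignatt_emit(attn_early, n_frames=10))  # True  -> safe to emit
print(alignatt_emit(attn_late, n_frames=10))   # False -> wait for more audio
```

The margin guards against committing tokens whose acoustic evidence may still change as new audio arrives, which is what makes simultaneous (rather than batch) transcription stable.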

Maintenance & Community

Details regarding maintainers, community channels (e.g., Discord, Slack), or a public roadmap are not provided in the README.

Licensing & Compatibility

The specific open-source license governing this project is not explicitly stated in the provided README. Clarification is needed regarding commercial use or closed-source linking compatibility.

Limitations & Caveats

  • MLX-only implementations are noted for high power consumption.
  • CoreML model generation necessitates cloning an external repository (whisper.cpp) and executing specific scripts.
  • Accurate power consumption benchmarking on Apple Silicon remains an area for community contribution.
Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 31 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (author of Hugging Face Diffusers; Research Engineer at Mistral), Jiaming Song (Chief Scientist at Luma AI), and 1 more.

delayed-streams-modeling by kyutai-labs

Streaming multimodal sequence-to-sequence learning
3k stars · Top 0.6% on SourcePulse
Created 6 months ago · Updated 1 month ago