Speech-to-text models optimized for fast, accurate ASR on edge devices
Moonshine is a family of automatic speech recognition (ASR) models designed for fast and accurate transcription on resource-constrained edge devices. It targets real-time applications like live captioning and voice commands, offering competitive word-error rates (WER) compared to similarly sized OpenAI Whisper models.
How It Works
Moonshine processes audio segments of variable length instead of padding every input to a fixed 30-second chunk, as Whisper does. Short inputs therefore require less computation: the project claims a 5x speed improvement over Whisper on 10-second segments while maintaining comparable or better accuracy.
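To make the padding cost concrete, here is a minimal sketch that counts how many audio samples an encoder would receive for a short clip under each scheme. The 16 kHz sample rate is an assumption (it is the rate Whisper uses), and the function names are hypothetical, not part of the Moonshine API. The sketch only counts input samples; it does not reproduce the claimed 5x figure, which depends on the model architecture.

```python
import math

# Assumption: 16 kHz mono audio (the rate Whisper uses); not confirmed for Moonshine here.
SAMPLE_RATE_HZ = 16_000
FIXED_WINDOW_S = 30  # fixed chunk length mentioned in the text


def samples_fixed_window(duration_s: float) -> int:
    """Samples seen when each clip is zero-padded up to whole 30-second windows."""
    windows = max(1, math.ceil(duration_s / FIXED_WINDOW_S))
    return windows * FIXED_WINDOW_S * SAMPLE_RATE_HZ


def samples_variable_length(duration_s: float) -> int:
    """Samples seen when the clip is processed at its actual length."""
    return round(duration_s * SAMPLE_RATE_HZ)


def padding_overhead(duration_s: float) -> float:
    """Ratio of padded input size to actual input size."""
    return samples_fixed_window(duration_s) / samples_variable_length(duration_s)


if __name__ == "__main__":
    for seconds in (2, 10, 30):
        print(seconds, samples_fixed_window(seconds), samples_variable_length(seconds))
```

For a 10-second clip, the fixed-window scheme feeds three times as many samples to the encoder as the variable-length scheme, and the gap widens as clips get shorter.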
Quick Start & Requirements
Install: uv pip install useful-moonshine (for Torch, TensorFlow, JAX) or uv pip install useful-moonshine-onnx (for ONNX).
Usage: moonshine.transcribe('audio.wav', 'moonshine/tiny')
Models: UsefulSensors/moonshine-tiny, UsefulSensors/moonshine-base
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats