whisper-burn  by Gadersd

Rust transcription tool using OpenAI's Whisper model

Created 2 years ago
322 stars

Top 84.3% on SourcePulse

Project Summary

Whisper Burn is a Rust implementation of OpenAI's Whisper speech-to-text model, aimed at developers who want native Rust performance and control. It is built on the Burn deep learning framework, so inference runs without Python dependencies.

How It Works

This project reimplements the Whisper architecture in Rust on top of Burn's tensor operations. Because Burn abstracts over backends, the same model code can run on tch (LibTorch) or wgpu (GPU acceleration). The approach aims for lower overhead and potentially faster execution than Python-based solutions, particularly in resource-constrained environments or when integrating transcription into an existing Rust application.

Quick Start & Requirements

  • Install: Clone the repository (git clone https://github.com/Gadersd/whisper-burn.git).
  • Prerequisites: Rust toolchain, wget, sox (for audio resampling). For wgpu backend, a compatible GPU and drivers are needed.
  • Model: Download pre-converted Burn-format models from Hugging Face (e.g., tiny_en).
  • Run: cargo run --release --bin transcribe <model_name> <audio_file> <language> <output_file>. Example: cargo run --release --bin transcribe tiny_en audio16k.wav en transcription.txt.
  • Audio: Input audio must be 16 kHz, single-channel (mono). Resample with: sox audio.wav -r 16000 -c 1 audio16k.wav.
  • Docs: Hugging Face Models
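
The audio requirement and run command above can be checked from Rust before launching a transcription. The following is a minimal sketch using only the standard library; it is not part of whisper-burn, and the names parse_wav_format, is_16k_mono, file_is_16k_mono, and transcribe_command are hypothetical helpers introduced for illustration. It assumes a standard RIFF/WAVE file with a `fmt ` chunk and reuses the argument order shown in the Run step.

```rust
use std::process::Command;

/// Channel count and sample rate read from a WAV file's `fmt ` chunk.
#[derive(Debug, PartialEq)]
pub struct WavFormat {
    pub channels: u16,
    pub sample_rate: u32,
}

/// Walks the RIFF chunks in `bytes` and returns the format of the first
/// `fmt ` chunk, or `None` if the data is not a well-formed WAV header.
pub fn parse_wav_format(bytes: &[u8]) -> Option<WavFormat> {
    // RIFF header layout: "RIFF", 4-byte size, "WAVE".
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return None;
    }
    let mut pos = 12;
    // Each chunk is a 4-byte id, a 4-byte little-endian size, then the payload.
    while pos + 8 <= bytes.len() {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            // The first 8 payload bytes hold: format tag (2), channels (2), sample rate (4).
            if size < 8 || body + 8 > bytes.len() {
                return None;
            }
            let channels = u16::from_le_bytes([bytes[body + 2], bytes[body + 3]]);
            let sample_rate = u32::from_le_bytes([
                bytes[body + 4],
                bytes[body + 5],
                bytes[body + 6],
                bytes[body + 7],
            ]);
            return Some(WavFormat { channels, sample_rate });
        }
        // Skip this chunk; payloads are padded to an even number of bytes.
        pos = body.checked_add(size)?.checked_add(size & 1)?;
    }
    None
}

/// True when the header describes 16 kHz, single-channel audio.
pub fn is_16k_mono(bytes: &[u8]) -> bool {
    parse_wav_format(bytes) == Some(WavFormat { channels: 1, sample_rate: 16_000 })
}

/// Convenience wrapper that reads the whole file into memory (fine for a sketch).
pub fn file_is_16k_mono(path: &std::path::Path) -> std::io::Result<bool> {
    Ok(is_16k_mono(&std::fs::read(path)?))
}

/// Builds (but does not run) the cargo invocation from the Run step:
/// cargo run --release --bin transcribe <model> <audio> <language> <output>
pub fn transcribe_command(model: &str, audio: &str, language: &str, output: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["run", "--release", "--bin", "transcribe", model, audio, language, output]);
    cmd
}
```

A caller could run file_is_16k_mono on the input first, fall back to the sox resampling step if it returns false, and then call .status() on the value returned by transcribe_command from inside the cloned repository.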

Highlighted Details

  • Native Rust implementation of Whisper.
  • Supports multiple Burn backends (tch, wgpu).
  • Includes scripts for converting Hugging Face models to Burn format.
  • Model files available on Hugging Face.

Maintenance & Community

The project is maintained by Gadersd. Community channels are not explicitly mentioned in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The wgpu backend is noted as potentially unstable for large models. Conversion scripts require tinygrad installed from source, which may add complexity to the setup.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
