Library for efficient Transformer model inference on edge devices
Top 65.9% on SourcePulse
This library provides efficient inference for Transformer models, specifically targeting low-cost, low-energy edge processors. It aims to enable high-speed speech-to-text transcription using OpenAI's Whisper model on devices like RK3588-based single-board computers, offering significant speedups over existing implementations.
How It Works
The core innovation lies in leveraging the NPU (Neural Processing Unit) on RK3588 processors for FP16 matrix multiplication. This significantly accelerates the large matrix operations in the Transformer encoder, which dominate inference time. The library's initial focus is on optimizing the Whisper model, particularly the tiny.en variant.
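To illustrate the kind of operation being offloaded, here is a minimal NumPy sketch of an FP16 encoder matrix multiply. This is illustrative only, not the library's API; the dimensions correspond to Whisper tiny.en's encoder (d_model = 384, 1500 audio positions), and on the RK3588 this product would run on the NPU rather than the CPU.

```python
import numpy as np

# Whisper tiny.en encoder dimensions: 1500 positions, model width 384.
seq_len, d_model = 1500, 384

# FP16 activations and weights, as the RK3588 NPU operates in FP16.
x = np.random.randn(seq_len, d_model).astype(np.float16)  # activations
w = np.random.randn(d_model, d_model).astype(np.float16)  # one weight matrix

# The NPU accelerates exactly this shape of product; NumPy runs it
# on the CPU here, purely to show the shapes and dtypes involved.
y = x @ w
print(y.shape, y.dtype)  # (1500, 384) float16
```

Each encoder layer performs several such products (attention projections and MLP layers), so accelerating FP16 matmul directly targets the bulk of the encoder's compute.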
Quick Start & Requirements
Install the prebuilt wheel (Python 3.10, linux/aarch64) and run transcription pinned to cores 4-7 (the RK3588's Cortex-A76 cores):
python -m pip install https://github.com/usefulsensors/useful-transformers/releases/download/0.1_rk3588/useful_transformers-0.1-cp310-cp310-linux_aarch64.whl
taskset -c 4-7 python -m useful_transformers.transcribe_wav <wav_file>
Highlighted Details
Optimized for the tiny.en Whisper model, with reported speedups over faster-whisper's int8 implementation.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The current implementation is limited to the tiny.en and base.en Whisper models; larger models are not yet supported. Further optimizations are planned, including int8/int4 matmuls and asynchronous kernel launches, suggesting the library is still under active development.
Last updated: 1 year ago · Status: Inactive