useful-transformers by moonshine-ai

Library for efficient Transformer model inference on edge devices

Created 2 years ago
459 stars

Top 65.9% on SourcePulse

Project Summary

This library provides efficient inference for Transformer models, specifically targeting low-cost, low-energy edge processors. It aims to enable high-speed speech-to-text transcription using OpenAI's Whisper model on devices like RK3588-based single-board computers, offering significant speedups over existing implementations.

How It Works

The core innovation is offloading FP16 matrix multiplication to the NPU (Neural Processing Unit) on RK3588 processors. This accelerates the large matrix multiplications inside the Transformer encoder, which dominate inference time. The library's initial focus is the Whisper model, particularly the tiny.en variant.
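
As a rough illustration of why matmul offload matters, the sketch below (plain NumPy, not the library's actual API) shows that a single encoder self-attention layer is essentially a chain of large matrix multiplications; the dimensions mirror Whisper tiny.en (model width 384, 1500 encoder frames for 30 s of audio):

```python
# Conceptual sketch in plain NumPy -- not useful-transformers' API.
# It shows that encoder self-attention is a chain of large matmuls,
# the kind of work the library routes to the RK3588 NPU in FP16.
import numpy as np

def self_attention(x, w_q, w_k, w_v, w_o):
    # Every @ below is a large matmul and a candidate for NPU offload.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.float16(np.sqrt(q.shape[-1]))
    scores = scores.astype(np.float32)          # softmax in higher precision
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return (weights.astype(np.float16) @ v) @ w_o

d = 384                                          # Whisper tiny.en model width
x = np.random.randn(1500, d).astype(np.float16)  # 30 s of audio -> 1500 frames
w = [np.random.randn(d, d).astype(np.float16) for _ in range(4)]
print(self_attention(x, *w).shape)               # (1500, 384)
```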

Quick Start & Requirements

  • Install via wheel package: python -m pip install https://github.com/usefulsensors/useful-transformers/releases/download/0.1_rk3588/useful_transformers-0.1-cp310-cp310-linux_aarch64.whl
  • Requires an RK3588 processor and a 64-bit (aarch64) Linux environment; the cp310 tag in the wheel name indicates a CPython 3.10 build.
  • Example transcription: taskset -c 4-7 python -m useful_transformers.transcribe_wav <wav_file> (taskset -c 4-7 pins the process to the RK3588's big Cortex-A76 cores; a batch sketch follows this list).
  • See GitHub Releases for the wheel package.
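
For batch use, the documented CLI entry point can be wrapped in a short script. A minimal sketch, assuming a recordings/ directory of WAV files (the directory name is illustrative):

```python
# Minimal batch-transcription sketch built on the documented CLI.
# The "recordings" directory is an assumption for illustration.
import pathlib
import subprocess

for wav in sorted(pathlib.Path("recordings").glob("*.wav")):
    # Pin to cores 4-7 (the RK3588's Cortex-A76 cores), as the README suggests.
    subprocess.run(
        ["taskset", "-c", "4-7",
         "python", "-m", "useful_transformers.transcribe_wav", str(wav)],
        check=True,
    )
```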

Highlighted Details

  • Achieves roughly 30x real-time transcription for Whisper tiny.en (see the arithmetic after this list).
  • Demonstrates 2x speed improvement over faster-whisper's int8 implementation.
  • Utilizes FP16 matrix multiplication on the RK3588 NPU for performance gains.
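
To make the 30x figure concrete, a quick back-of-the-envelope calculation (the 60 s clip length is an arbitrary example):

```python
# What a 30x real-time factor means in wall-clock terms.
audio_seconds = 60.0   # arbitrary example clip length
speedup = 30.0         # reported real-time factor for tiny.en
print(f"{audio_seconds:.0f} s of audio transcribed in ~{audio_seconds / speedup:.0f} s")
```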

Maintenance & Community

  • Active contributors include Nat Jeffries, Manjunath Kudlur, Guy Nicholson, James Wang, Pete Warden, and Ali Zartash.
  • TODO list indicates plans for larger Whisper models, int8/int4 matmuls, and asynchronous kernel launches.

Licensing & Compatibility

  • The license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore unclear.

Limitations & Caveats

The current implementation is limited to the tiny.en and base.en Whisper models; larger models are not yet supported. Further optimizations (int8/int4 matmuls, asynchronous kernel launches) appear on the TODO list, though the Health Check below shows little recent activity.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 14 stars in the last 30 days
