FastASR by chenkui164

C++ ASR inference project for ARM platforms

Created 3 years ago

543 stars

Top 58.6% on SourcePulse

Project Summary

FastASR is a C++ Automatic Speech Recognition (ASR) inference engine designed for high performance and minimal dependencies. It targets developers and researchers who need efficient ASR on a range of platforms, including ARM devices such as the Raspberry Pi 4B. By using optimized Transformer models trained on large datasets, it aims for accuracy close to commercial grade while remaining fast enough for practical speech-to-text tasks.

How It Works

This project implements ASR inference purely in C++, with no dependency on deep learning frameworks such as PyTorch or TensorFlow. This leaves room for CPU optimizations tailored to specific architectures. By minimizing data copying and working directly on buffers through pointers, FastASR runs inference faster than framework-based solutions, particularly on resource-constrained devices. It supports both non-streaming and streaming models; for non-streaming models, voice activity detection (VAD) splits long audio into segments so that recordings of arbitrary length can be processed.
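To illustrate how VAD makes long audio manageable for a non-streaming model, here is a minimal sketch of energy-based segmentation. It is not FastASR's VAD: the function name `split_on_silence`, its parameters, and the threshold values are all hypothetical, chosen only to show the general idea of cutting a recording at pauses so each segment can be recognized independently.

```python
from typing import List, Optional, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 30,
    energy_threshold: float = 1e-3,
    min_silence_frames: int = 10,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of speech segments.

    Hypothetical energy-based VAD, not FastASR's implementation: a frame
    counts as speech when its mean squared amplitude exceeds
    energy_threshold. An open segment is closed once min_silence_frames
    consecutive silent frames have been seen.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    segments: List[Tuple[int, int]] = []
    seg_start: Optional[int] = None  # start of the currently open segment
    last_speech_end = 0              # end of the most recent speech frame
    silent_run = 0                   # consecutive silent frames seen so far

    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > energy_threshold:
            if seg_start is None:
                seg_start = start
            last_speech_end = start + len(frame)
            silent_run = 0
        elif seg_start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Pause is long enough: close the segment at the last speech frame.
                segments.append((seg_start, last_speech_end))
                seg_start = None
                silent_run = 0

    # Flush a segment that is still open when the audio ends.
    if seg_start is not None:
        segments.append((seg_start, last_speech_end))
    return segments
```

Each returned range could then be passed to a non-streaming recognizer on its own, which keeps per-call memory and latency bounded regardless of the total recording length.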

Quick Start & Requirements

Install: pip install fastasr for Python users. Source compilation is also supported for C++ integration and custom builds.
Prerequisites: CPython 3.6-3.11, libfftw3, libopenblas. For best performance on the Raspberry Pi 4B, a 64-bit OS and recompiling the dependencies on the device are recommended. Pre-trained models must be downloaded separately.
Setup: Installation via pip is straightforward. Compiling from source and downloading models may take longer depending on system resources and network speed.
Links: Example Usage

Highlighted Details

Supports four models: Paraformer, k2_rnnt2, conformer, and conformer_online (streaming).
Achieves real-time performance on ARM platforms like Raspberry Pi 4B.
Offers both a C++ static library (libfastasr.a) and a Python module (PyFastASR).
Models are trained on WenetSpeech (10,000+ hours) and private Alibaba datasets (60,000+ hours).
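"Real-time performance" is commonly quantified with the real-time factor (RTF): processing time divided by audio duration, where a value below 1.0 means the recognizer keeps up with incoming speech. The summary above gives no RTF figures, so the sketch below only shows how one might measure it. The `recognize` callable is a placeholder for whatever inference call is being timed; it is an assumption, not part of the FastASR API.

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    recognize: Callable[[Sequence[float]], object],
    samples: Sequence[float],
    sample_rate: int,
) -> float:
    """Time one call to `recognize` (a placeholder callable) and return its RTF."""
    start = time.perf_counter()
    recognize(samples)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

For example, decoding 10 seconds of audio in 2 seconds gives `real_time_factor(2.0, 10.0)`, an RTF of 0.2.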

Maintenance & Community

The project appears to be actively developed by chenkui164. Further community engagement channels are not explicitly listed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Model quantization and compression are still in progress. Adding punctuation requires a separate NLP model. The README notes that some models are large and slow, which can hurt client-side performance, though the C++ implementation aims to mitigate this.
