FastASR by chenkui164

C++ ASR inference project for ARM platforms

created 3 years ago
531 stars

Top 60.4% on sourcepulse

Project Summary

FastASR is a C++ Automatic Speech Recognition (ASR) inference engine built for high performance with minimal dependencies. It targets developers and researchers who need efficient ASR on a range of platforms, including ARM devices such as the Raspberry Pi 4B. It uses optimized Transformer models trained on large datasets and aims for near-commercial-grade accuracy on speech-to-text tasks.

How It Works

The project implements ASR inference purely in C++, with no dependency on deep learning frameworks such as PyTorch or TensorFlow. This leaves room for CPU optimizations tailored to specific architectures. By minimizing data copying and relying on pointer-based algorithms, FastASR achieves faster inference than framework-based solutions, particularly on resource-constrained devices. It supports both non-streaming and streaming models. For non-streaming models, voice activity detection (VAD) makes it possible to process long audio.
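To make the role of VAD for long audio concrete, here is a minimal sketch of an energy-based splitter. It is not FastASR's VAD: the function name `split_on_silence`, the threshold, and the frame sizes are all assumptions chosen for illustration. It only shows the general idea of cutting a long recording at pauses so that each piece can be handed to a non-streaming model on its own.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[int],
    sample_rate: int,
    frame_ms: int = 30,
    energy_threshold: float = 500.0,   # assumed value for 16-bit PCM
    min_silence_frames: int = 10,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of speech segments.

    A frame counts as speech when its mean absolute amplitude exceeds
    energy_threshold. A segment is closed after min_silence_frames
    consecutive silent frames.
    """
    frame_len = sample_rate * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("sample_rate and frame_ms give an empty frame")

    segments: List[Tuple[int, int]] = []
    start = None          # start index of the open speech segment, if any
    last_speech_end = 0   # end index of the most recent speech frame
    silent_run = 0        # consecutive silent frames inside a segment

    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        energy = sum(abs(s) for s in frame) / len(frame)
        if energy > energy_threshold:
            if start is None:
                start = offset
            last_speech_end = offset + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                segments.append((start, last_speech_end))
                start = None
                silent_run = 0

    # Close a segment that is still open when the audio ends.
    if start is not None:
        segments.append((start, last_speech_end))
    return segments
```

Each returned index pair can be used to slice the original sample buffer, and the slices can then be recognized one at a time. A production VAD would typically use a trained model rather than a fixed energy threshold.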

Quick Start & Requirements

  • Install: pip install fastasr for Python users. Source compilation is also supported for C++ integration and custom builds.
  • Prerequisites: CPython 3.6-3.11, libfftw3, libopenblas. For Raspberry Pi 4B optimization, a 64-bit OS and recompilation of dependencies are recommended. Pre-trained models must be downloaded separately.
  • Setup: Installation via pip is straightforward. Compiling from source and downloading models may take longer depending on system resources and network speed.
  • Links: Example Usage
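
Before installing, it can help to confirm that the prerequisites listed above are visible on the system. The following is a rough sketch using only the Python standard library. The helper name `check_prerequisites` is hypothetical, and the library names passed to `find_library` ("fftw3", "openblas") are assumptions based on the prerequisite list; the exact variants the build links against may differ.

```python
import ctypes.util
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report whether the listed prerequisites appear to be present."""
    # The summary lists CPython 3.6 through 3.11 as supported.
    python_ok = (3, 6) <= sys.version_info[:2] <= (3, 11)
    return {
        "python_3.6_to_3.11": python_ok,
        # find_library returns None when the shared library is not found
        # on the default search path (assumed library names).
        "libfftw3": ctypes.util.find_library("fftw3") is not None,
        "libopenblas": ctypes.util.find_library("openblas") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A "missing" result is only a hint: `find_library` searches standard locations, so libraries that were recompiled and installed under a custom prefix (as recommended for the Raspberry Pi 4B) may not be detected.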

Highlighted Details

  • Supports four models: Paraformer, k2_rnnt2, conformer, and conformer_online (streaming).
  • Achieves real-time performance on ARM platforms like Raspberry Pi 4B.
  • Offers both C++ static library (libfastasr.a) and Python module (PyFastASR) interfaces.
  • Models are trained on WenetSpeech (10000+ hours) and private Alibaba datasets (60000+ hours).
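
"Real-time performance" is usually quantified with the real-time factor (RTF): processing time divided by audio duration, where a value below 1 means recognition finishes faster than the audio plays. The sketch below measures it for any recognizer passed in as a callable. The `recognize` parameter is a placeholder, not a FastASR API, and no measured numbers are implied.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    recognize: Callable[[Sequence[int]], str],
    samples: Sequence[int],
    sample_rate: int,
) -> float:
    """Return processing time divided by audio duration for one call."""
    if sample_rate <= 0 or len(samples) == 0:
        raise ValueError("need a positive sample rate and non-empty audio")
    audio_seconds = len(samples) / sample_rate
    started = time.perf_counter()
    recognize(samples)  # placeholder for the actual recognition call
    elapsed = time.perf_counter() - started
    return elapsed / audio_seconds
```

To benchmark a device such as the Raspberry Pi 4B, one would wrap the real model call in `recognize` and average the result over several recordings, since a single run can be skewed by cache warm-up and model loading.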

Maintenance & Community

The project is developed by chenkui164. The last commit was 2 years ago, so active maintenance should not be assumed. The README does not list any community engagement channels.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Model quantization and compression are still in progress. Adding punctuation requires a separate NLP model. The README notes that some models are large and slow, which can hurt client-side performance, though the C++ implementation aims to mitigate this.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 18 stars in the last 90 days
