RapidASR by RapidAI

Open-source library for automatic speech recognition

Created 4 years ago

597 stars

Top 54.6% on SourcePulse

Project Summary

RapidASR is a cross-platform, commercial-grade open-source library for Automatic Speech Recognition (ASR) inference, aimed at developers who need easy-to-use, offline speech recognition for Chinese and English. It builds on ONNXRuntime and the FunASR framework and exposes a unified API across several ASR models.

How It Works

Audio passes through a three-stage pipeline: optional Voice Activity Detection (VAD) segments the speech, ONNXRuntime runs the core ASR inference, and a punctuation restoration module (RapidPunc) refines the recognized text. Because the stages are modular and inference runs on ONNXRuntime, the pipeline is cross-platform and different ASR models, such as the Paraformer model from Alibaba DAMO Academy, can be swapped in.
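The stage order described above can be illustrated with a minimal, standard-library-only sketch. Nothing here is RapidASR's actual API: `energy_vad`, `transcribe`, the frame length, and the threshold are hypothetical names and values chosen for illustration, and a simple energy threshold stands in for a real VAD model. The recognizer and punctuator are passed in as plain callables so that any model could fill those roles.

```python
from typing import Callable, List, Sequence, Tuple


def energy_vad(
    samples: Sequence[float], frame_len: int = 160, threshold: float = 0.01
) -> List[Tuple[int, int]]:
    """Hypothetical stand-in for a VAD model.

    Returns (start, end) sample ranges whose mean frame energy
    meets or exceeds `threshold`.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = i  # speech begins at this frame
        elif start is not None:
            segments.append((start, i))  # speech ended before this frame
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


def transcribe(
    samples: Sequence[float],
    recognize: Callable[[Sequence[float]], str],
    punctuate: Callable[[str], str],
    use_vad: bool = True,
) -> str:
    """Run optional VAD, then recognition per segment, then punctuation."""
    if use_vad:
        segments = energy_vad(samples)
    else:
        segments = [(0, len(samples))]  # treat the whole clip as one segment
    texts = [recognize(samples[s:e]) for s, e in segments]
    raw_text = " ".join(t for t in texts if t)
    return punctuate(raw_text)
```

In a real deployment, `recognize` would wrap an ONNXRuntime session for the ASR model and `punctuate` would wrap the punctuation model. Keeping them as injected callables is what makes the stages independently replaceable.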

Quick Start & Requirements

Install: pip install rapid_paraformer
Prerequisites: Python 3.7+, ONNXRuntime, librosa.
Resources: Requires downloading pre-trained models and configuration files.
Docs: https://github.com/RapidAI/RapidASR

Highlighted Details

Supports both Python and C++ inference engines.
Offers batch inference capabilities.
Handles various input audio formats and types (file paths, NumPy arrays).
Includes a punctuation restoration module (RapidPunc).
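Accepting both file paths and in-memory arrays, and grouping inputs for batch inference, usually comes down to a small normalization step in front of the model. The sketch below is an assumption about how such a front end could look, not RapidASR's implementation: `load_samples` and `batched` are hypothetical helpers, plain Python lists stand in for NumPy arrays, and only 16-bit mono WAV files are handled.

```python
import struct
import wave
from pathlib import Path
from typing import Iterator, List, Sequence, Union

# Either a path to a WAV file or an in-memory sequence of samples.
AudioInput = Union[str, Path, Sequence[float]]


def load_samples(source: AudioInput) -> List[float]:
    """Normalize a path or a sample sequence to a list of floats in [-1, 1)."""
    if isinstance(source, (str, Path)):
        with wave.open(str(source), "rb") as wf:
            if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
                raise ValueError("this sketch only handles 16-bit mono WAV")
            raw = wf.readframes(wf.getnframes())
        ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
        return [v / 32768.0 for v in ints]  # scale int16 to float
    return [float(v) for v in source]  # already in memory: copy as floats


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

A caller could then map `load_samples` over a mixed list of inputs and feed each group from `batched` to the recognizer in one call.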

Maintenance & Community

The core code has been integrated into FunASR.
QQ Group: 645751008
The project still receives updates.

Licensing & Compatibility

License details are not explicitly stated in the README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project's core code has been merged into FunASR, so future maintenance may shift there. The README does not detail licensing terms for commercial use, so enterprise adopters should verify them before deploying.

Explore Similar Projects

onnx-asr by istupakov

orate by haydenbleasel

reverb by revdotcom

SenseVoice.cpp by lovemefan

parrots by shibing624

TensorflowASR by Z-yq

athena by athena-team

SenseVoice by FunAudioLLM

wenet by wenet-e2e

sherpa-onnx by k2-fsa

FunASR by modelscope

PaddleSpeech by PaddlePaddle