wenet-e2e: ASR toolkit for production-ready end-to-end speech recognition
Top 10.2% on SourcePulse
WeNet is a production-ready, end-to-end speech recognition toolkit designed for both streaming and non-streaming applications. It offers a full-stack solution for ASR development, targeting researchers and engineers who need accurate, lightweight, and well-documented tools for building and deploying speech recognition systems.
How It Works
WeNet integrates both Transformer and Conformer models in a hybrid CTC/attention architecture aimed at state-of-the-art accuracy. It supports WFST-based decoding for language model integration and provides runtime solutions for deployment.
Quick Start & Requirements
pip install git+https://github.com/wenet-e2e/wenet.git
Create a Conda environment (conda create -n wenet python=3.10), install sox (conda install conda-forge::sox), PyTorch (pip install torch==2.2.2+cu121 torchaudio==2.2.2+cu121 -f https://download.pytorch.org/whl/torch_stable.html), and other dependencies (pip install -r requirements.txt).
sox and libsox-dev (Ubuntu/CentOS). Ascend NPU support requires the CANN toolkit.
cmake 3.14+.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The runtime build for x86 or LM integration requires manual compilation steps. Specific hardware acceleration (e.g., Ascend NPU) necessitates separate installation of vendor-specific toolkits and kernel drivers.
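Because the build steps are manual, it can help to confirm that the tools named under Quick Start & Requirements are present before starting. The following is a minimal sketch, not part of WeNet: the helper names (parse_cmake_version, cmake_is_new_enough, check_environment) are hypothetical, and it assumes that `cmake --version` prints a first line of the form "cmake version X.Y.Z". It only checks for sox and cmake 3.14+ on the PATH and does not verify libsox-dev, PyTorch, or the CANN toolkit.

```python
import re
import shutil
import subprocess

# Minimum cmake version taken from the requirements list; helper names are illustrative.
MIN_CMAKE = (3, 14)


def parse_cmake_version(output):
    """Extract (major, minor) from `cmake --version` output, or None if absent."""
    match = re.search(r"cmake version (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cmake_is_new_enough(output, minimum=MIN_CMAKE):
    """True when the reported cmake version is at least `minimum`."""
    version = parse_cmake_version(output)
    return version is not None and version >= minimum


def check_environment():
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if shutil.which("sox") is None:
        problems.append("sox not found on PATH")
    cmake = shutil.which("cmake")
    if cmake is None:
        problems.append("cmake not found on PATH")
    else:
        result = subprocess.run(
            [cmake, "--version"], capture_output=True, text=True, check=False
        )
        if not cmake_is_new_enough(result.stdout):
            problems.append("cmake is older than 3.14 or its version could not be read")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per missing or outdated tool and nothing when both checks pass. Parsing is kept separate from the subprocess call so the version logic can be exercised without cmake installed.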