wenet by wenet-e2e

ASR toolkit for production-ready end-to-end speech recognition

Created 5 years ago

4,990 stars

Top 9.9% on SourcePulse

2 Experts Love This Project

Binyuan Hui

Research Scientist at Alibaba Qwen

Benjamin Bolte

Cofounder of K-Scale Labs

Project Summary

WeNet is a production-ready, end-to-end speech recognition toolkit designed for both streaming and non-streaming applications. It offers a full-stack solution for ASR development, targeting researchers and engineers who need accurate, lightweight, and well-documented tools for building and deploying speech recognition systems.

How It Works

WeNet supports both Transformer and Conformer models, which the project reports as reaching state-of-the-art accuracy on public speech datasets. It supports WFST-based decoding for language model (LM) integration and provides runtime solutions for deployment.

Quick Start & Requirements

Install (runtime only): pip install git+https://github.com/wenet-e2e/wenet.git
Install (training/deployment):
  1. Clone the repo.
  2. Create and activate a Conda env: conda create -n wenet python=3.10, then conda activate wenet
  3. Install sox: conda install conda-forge::sox
  4. Install PyTorch: pip install torch==2.2.2+cu121 torchaudio==2.2.2+cu121 -f https://download.pytorch.org/whl/torch_stable.html
  5. Install the remaining dependencies: pip install -r requirements.txt
Prerequisites: CUDA 12.1 recommended, Python 3.10, and sox with its development package (libsox-dev on Ubuntu, sox-devel on CentOS). Ascend NPU support requires the CANN toolkit.
Runtime build: Requires cmake 3.14+.
Docs: the repository links to a roadmap and the project documentation.
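
The prerequisites above can be checked before starting an install. The following is a minimal sketch, not part of WeNet: the helper names (parse_version, cmake_ok, check_environment) are hypothetical, and the thresholds are taken from the requirements listed in this section (Python 3.10, sox on the PATH, cmake 3.14+ for the runtime build).

```python
import re
import shutil
import subprocess
import sys

# Thresholds copied from the requirements listed above (assumed still current).
MIN_CMAKE = (3, 14)
WANTED_PYTHON = (3, 10)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def cmake_ok(version_output, minimum=MIN_CMAKE):
    """True if the output of `cmake --version` reports at least the minimum version."""
    version = parse_version(version_output)
    return version is not None and version[:2] >= minimum


def check_environment():
    """Collect human-readable problems; an empty list means nothing was flagged."""
    problems = []
    if sys.version_info[:2] != WANTED_PYTHON:
        found = "%d.%d" % sys.version_info[:2]
        problems.append("Python 3.10 is expected, found " + found)
    if shutil.which("sox") is None:
        problems.append("sox was not found on PATH")
    cmake = shutil.which("cmake")
    if cmake is None:
        problems.append("cmake was not found (only needed for the runtime build)")
    else:
        output = subprocess.run(
            [cmake, "--version"], capture_output=True, text=True
        ).stdout
        if not cmake_ok(output):
            problems.append("cmake 3.14 or newer is required for the runtime build")
    return problems
```

Calling check_environment() and printing each returned string gives a quick list of what still needs to be installed. The sketch does not check CUDA or the CANN toolkit, since how those are detected depends on the machine.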

Highlighted Details

Achieves SOTA results on public speech datasets.
Production-first design with full-stack solutions.
Supports both streaming and non-streaming ASR.
Well-documented with Python and command-line usage examples.
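
As an illustration of the Python usage mentioned above, here is a minimal sketch. It assumes the installed wenet package exposes load_model(language) and a model method transcribe(path) that returns a mapping with a "text" key, as in the project README at the time of writing; check the current documentation before relying on it. The wrapper transcribe_file and its load_model parameter are hypothetical and exist only so the function can be exercised without downloading a model.

```python
def transcribe_file(audio_path, language="chinese", load_model=None):
    """Transcribe one audio file and return the recognized text.

    load_model is an optional factory taking a language name and returning an
    object with a transcribe(path) method. If omitted, wenet.load_model is
    used (assumed API, see the note above).
    """
    if load_model is None:
        import wenet  # imported lazily so the module loads without wenet installed

        load_model = wenet.load_model
    model = load_model(language)
    result = model.transcribe(audio_path)  # assumed to return {"text": ...}
    return result["text"]
```

With wenet installed, transcribe_file("audio.wav") would load the pretrained model for the given language and return the transcript. Here "audio.wav" is a placeholder path.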

Maintenance & Community

Active development with contributions from multiple authors.
Discussion primarily via GitHub Issues. WeChat group available for Chinese users.

Licensing & Compatibility

Licensed under Apache 2.0.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

Building the runtime for x86, or building with LM integration, requires manual compilation steps. Hardware acceleration such as Ascend NPU requires separately installing vendor-specific toolkits and kernel drivers.

Health Check

Last Commit

3 weeks ago

Responsiveness

1 day

Star History

50 stars in the last 30 days