speech-recognition-papers by wenet-e2e

Collection of speech recognition research papers

created 4 years ago
327 stars

Top 84.7% on sourcepulse

Project Summary

This repository is a curated list of research papers on end-to-end automatic speech recognition (ASR) techniques suited to industrial use. It targets ASR researchers and engineers, giving a structured overview of work on streaming, non-autoregressive, and on-device ASR, as well as related areas such as rescoring and self-supervised learning.

How It Works

The repository groups papers by ASR research direction: Streaming ASR (RNA, RNN-T, Attention-based), Non-Autoregressive (NAR) ASR, ASR Rescoring, On-device ASR, and Self-Supervised Learning (SSL), among others. Within each category it lists influential papers, often noting the specific architectural variation or training method involved (e.g., Conformer-equipped RNN-T, Mask CTC, wav2vec 2.0). This structure lets readers quickly find the relevant papers for a given direction.

Highlighted Details

  • Coverage of streaming ASR architectures, including RNA, RNN-T, and attention-based models, with Transformer and Conformer variants.
  • A list of non-autoregressive (NAR) ASR techniques, such as Mask-Predict, Imputer, and insertion-based methods.
  • Papers on ASR rescoring and spelling correction, on-device ASR, and self-supervised learning (SSL) methods such as APC and CPC.
  • Categories for unified streaming/non-streaming models and multi-speaker ASR approaches.

Maintenance and Community

This is a community-driven list that invites pull requests adding new papers or corrections. The README does not name specific contributors or maintainers.

Licensing and Compatibility

The repository itself does not contain code, only a list of papers. Therefore, no specific software license or compatibility restrictions apply to the repository's content.

Limitations and Caveats

This repository is a reference list of papers and does not provide implementations, code, or benchmarks. Users must consult the individual papers for technical details, code availability, and performance evaluations.

Health Check

Last commit: 3 years ago
Responsiveness: 1+ week
Pull Requests (30d): 0
Issues (30d): 0

Star History

2 stars in the last 90 days
