wer_are_we by syhw

Tracking speech recognition state-of-the-art

Created 11 years ago

1,863 stars

Top 22.5% on SourcePulse

6 Experts Love This Project

Jeff Hammerbacher

Cofounder of Cloudera

Tristan Hume

MTS at Anthropic

Hanlin Tang

CTO Neural Networks at Databricks; Cofounder of MosaicML

David Cournapeau

Author of scikit-learn

and 2 more!

Project Summary

This repository is a curated bibliography of state-of-the-art results in automatic speech recognition (ASR). It tracks word error rate (WER) across standard datasets and benchmarks, so researchers and practitioners can see how ASR performance has progressed and which methods lead on each benchmark.
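Because every number the repository tracks is a WER, it helps to recall how the metric is usually computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch under that standard definition. The function name `word_error_rate` is hypothetical, and the whitespace tokenization is an assumption; published results typically apply benchmark-specific text normalization that is not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: words are separated by whitespace and compared exactly;
    no case folding or punctuation stripping is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is dropped out of six.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER is normalized by the reference length only, so a hypothesis with many inserted words can score above 1.0 (above 100%).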

How It Works

The project compiles WER figures from published research papers and organizes them by benchmark dataset, including LibriSpeech, WSJ, Hub5'00, Fisher, TED-LIUM, CHiME6, and TIMIT. For each cited work it notes the model, training data, and augmentation techniques used, so readers can compare ASR approaches on the same benchmark.
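The README itself is plain tables, not code, but the organization described above can be sketched as a small data structure for readers who want to work with the numbers programmatically. Everything in this example is an assumption made for illustration: the `Entry` fields, the `best_per_benchmark` helper, and especially the benchmark names, model names, and WER values, which are placeholders and not results from the repository.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Entry:
    """One row of a hypothetical results table (field names are assumptions)."""
    benchmark: str      # e.g. a test-set name copied from the README
    model: str          # short description of the system
    wer_percent: float  # WER in percent; lower is better
    year: int           # publication year of the cited paper


def best_per_benchmark(entries: Iterable[Entry]) -> Dict[str, Entry]:
    """Return the lowest-WER entry for each benchmark."""
    best: Dict[str, Entry] = {}
    for entry in entries:
        current = best.get(entry.benchmark)
        if current is None or entry.wer_percent < current.wer_percent:
            best[entry.benchmark] = entry
    return best


# Placeholder rows only: these are NOT real results.
entries = [
    Entry("benchmark-a", "model-x", 5.0, 2019),
    Entry("benchmark-a", "model-y", 3.5, 2021),
    Entry("benchmark-b", "model-x", 12.0, 2019),
]

if __name__ == "__main__":
    for name, entry in sorted(best_per_benchmark(entries).items()):
        print(f"{name}: {entry.model} ({entry.wer_percent}% WER, {entry.year})")
```

Comparisons are only meaningful within a single benchmark, which is why the helper groups by `benchmark` before taking the minimum.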

Quick Start & Requirements

This repository is a reference list, so there is nothing to install or run. All of the information is in the README and can be read directly.

Highlighted Details

Comprehensive tracking of WER across multiple standard ASR benchmarks.
Detailed information on model architectures (e.g., Conformer, HuBERT, Deep Speech 2), training strategies, and data augmentation techniques.
Includes human performance benchmarks for context.
Covers a wide range of ASR research from 2009 to the present.

Maintenance & Community

The repository is maintained by "syhw" and is open to community contributions for corrections and additions, as indicated by the "Feel free to correct!" invitation.

Licensing & Compatibility

The README does not state a license.

Limitations & Caveats

The README does not provide direct links to the cited papers or code implementations, requiring users to search for them independently. The data is presented as a bibliography and does not include executable code or pre-trained models. Some entries are marked with "TODO," indicating incomplete information or areas for future expansion.

Health Check

Last Commit

4 years ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 star in the last 30 days