This project provides a comprehensive benchmarking platform for Automatic Speech Recognition (ASR) systems, targeting researchers and developers in the ASR field. It aims to foster improvement in ASR technology by making it easy to compare models, reproduce results, and examine behavior across a wide array of speech tasks and scenarios.
How It Works
The platform comprises three core components: a TestSet Zoo with diverse academic and custom-curated datasets (including Chinese datasets with detailed scenario and topic information), a Model Zoo featuring both commercial cloud APIs and popular open-source local models, and a standardized Benchmarking Pipeline for data preparation, recognition, post-processing, and error rate evaluation. This integrated approach ensures consistent and reproducible benchmarking.
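To make those stages concrete, here is a minimal Python sketch of how such a pipeline might be wired together. All names (prepare, recognize, postprocess) and the stubbed recognizer are illustrative assumptions, not the project's actual API:

```python
# Minimal sketch of the pipeline's stages: prepare -> recognize -> post-process.
# Every name here is hypothetical; recognize() is a stub standing in for a real
# cloud-API or local-model call.
import re

def prepare(manifest):
    """Data preparation: yield (audio_path, reference_text) pairs from a test-set manifest."""
    for entry in manifest:
        yield entry["audio"], entry["text"]

def recognize(audio_path):
    """Recognition: a real pipeline would invoke a cloud API or local model here."""
    return "Hello world"  # stubbed hypothesis

def postprocess(text):
    """Post-processing: normalize punctuation and case before error-rate scoring."""
    return re.sub(r"[^\w\s]", "", text).lower().strip()

# Produce normalized (reference, hypothesis) pairs, ready for error-rate evaluation.
manifest = [{"audio": "utt001.wav", "text": "Hello, world!"}]
pairs = [(postprocess(ref), postprocess(recognize(path)))
         for path, ref in prepare(manifest)]
print(pairs)  # [('hello world', 'hello world')]
```

Standardizing the post-processing step this way is what makes scores comparable across cloud APIs and local models, since different systems emit different casing and punctuation.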
Quick Start & Requirements
- Installation and usage details are not explicitly provided in the README.
- The project requires access to various ASR models (cloud APIs or local model files) and the specified datasets.
- Specific hardware or software prerequisites are not detailed.
Highlighted Details
- Extensive TestSet Zoo includes 26 curated Chinese datasets covering diverse scenarios (e.g., news, interviews, live broadcasts, lectures, podcasts) and topics, alongside popular English datasets like LibriSpeech and GigaSpeech.
- Model Zoo lists numerous English and Chinese ASR models, including major cloud providers (Alibaba, Amazon, Google, Microsoft, Tencent) and local models (Coqui, DeepSpeech, NeMo, Vosk, Whisper, Paraformer).
- Detailed benchmarking results, reported as Character Error Rate (CER), are presented for each model across the test sets, allowing direct comparison (see the CER sketch after this list).
- The platform supports both "unlocked" (publicly available) and "locked" (restricted-access) test sets, broadening evaluation beyond publicly released data.
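For reference, CER is the character-level Levenshtein edit distance between hypothesis and reference, divided by the total reference length. A minimal, self-contained sketch of that computation (not taken from the project's code):

```python
# Hypothetical CER computation: character-level Levenshtein edit distance
# divided by reference length. Illustrative only.
def edit_distance(ref: str, hyp: str) -> int:
    """Standard dynamic-programming Levenshtein distance over characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(refs, hyps):
    """Total character edits over total reference characters, as a fraction."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return edits / chars

print(f"{cer(['hello world'], ['helo world']):.3f}")  # one deletion / 11 chars ≈ 0.091
```

CER rather than word error rate is the natural metric here because Chinese text, which makes up most of the TestSet Zoo, has no whitespace word boundaries.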
Maintenance & Community
- Contact email: leaderboard@speechio.ai.
- No specific community channels (Discord, Slack) or roadmap information are provided in the README.
Licensing & Compatibility
- The README does not specify a license for the project code or the datasets.
- Compatibility for commercial use or closed-source linking is not detailed.
Limitations & Caveats
- The README lacks explicit instructions for setup, installation, or running the benchmarking pipeline.
- Information regarding project maintenance, contribution guidelines, or community engagement is absent.
- Some listed models (e.g., bilibili_api_zh(*)) are marked as not yet universally available to the public.