facebookresearch
Multilingual speech recognition for over 1,600 languages
Top 18.1% on SourcePulse
Omnilingual ASR is an open-source speech recognition system designed for broad accessibility, supporting over 1,600 languages, including hundreds previously uncovered by any ASR technology. It aims to make speech technology more inclusive and adaptable for communities and researchers worldwide by enabling new languages to be added with minimal data through scalable zero-shot learning.
How It Works
The system employs a flexible model family combining wav2vec 2.0 (W2V), Connectionist Temporal Classification (CTC), and Large Language Model (LLM) architectures. Its core innovation lies in scalable zero-shot learning, allowing rapid adaptation to new languages using only a few paired examples, thereby circumventing the need for extensive, specialized datasets. This approach enhances inclusivity and adaptability for diverse linguistic communities.
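To make the CTC component more concrete, here is a minimal sketch of generic greedy CTC decoding: pick the best symbol per audio frame, collapse consecutive repeats, then drop the blank symbol. This is not the project's code; the function name, the toy vocabulary, the blank index, and the frame scores are all assumptions made up for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol in the toy vocabulary


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank: int = BLANK,
) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, remove blanks."""
    # Best-scoring symbol index for each frame.
    best = [max(range(len(scores)), key=lambda i: scores[i]) for scores in frame_scores]

    out: List[str] = []
    prev = None
    for idx in best:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != prev and idx != blank:
            out.append(vocab[idx])
        prev = idx
    return "".join(out)


# Hypothetical vocabulary and per-frame scores (illustrative values only).
vocab = ["_", "c", "a", "t"]
frames = [
    [0.1, 0.7, 0.1, 0.1],    # c
    [0.1, 0.6, 0.2, 0.1],    # c (repeat, merged)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.7, 0.1],    # a
    [0.1, 0.1, 0.1, 0.7],    # t
    [0.1, 0.1, 0.1, 0.7],    # t (repeat, merged)
]
print(ctc_greedy_decode(frames, vocab))  # prints "cat"
```

The blank symbol is what lets CTC emit a genuinely doubled letter: two identical symbols separated by a blank frame are kept as two characters rather than merged.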
Quick Start & Requirements
Install: pip install omnilingual-asr or uv add omnilingual-asr.
Prerequisite: libsndfile is required for audio support (e.g., brew install libsndfile on macOS).
Links: … facebook/omnilingual-asr-corpus), Paper, Blogpost, Documentation, Quick Start, Inference Guide.
Highlighted Details
Maintenance & Community
The project is attributed to the "Omnilingual ASR Team" with numerous listed authors. Specific community channels (e.g., Discord, Slack) or explicit roadmap links are not detailed in the provided README.
Licensing & Compatibility
The code and models are released under the Apache 2.0 license, which generally permits commercial use and integration into closed-source projects.
Limitations & Caveats
Currently, the inference pipeline only accepts audio files shorter than 40 seconds. Support for transcribing unlimited-length audio files is planned for a future release.
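Until long-form support lands, one possible workaround is to split longer recordings into segments that stay under the limit and transcribe each segment separately. The sketch below only plans the segment boundaries; the function name and the 30-second default are assumptions chosen for illustration, not part of the project.

```python
from typing import List, Tuple


def plan_chunks(
    total_seconds: float,
    max_chunk_seconds: float = 30.0,  # assumed margin below the 40-second limit
) -> List[Tuple[float, float]]:
    """Return (start, end) spans in seconds that cover the whole recording."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")

    spans: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        # The last span is shortened so it ends exactly at the recording's end.
        end = min(start + max_chunk_seconds, total_seconds)
        spans.append((start, end))
        start = end
    return spans


print(plan_chunks(95.0))
# prints [(0.0, 30.0), (30.0, 60.0), (60.0, 90.0), (90.0, 95.0)]
```

Fixed boundaries like these can cut a word in half, so in practice it may be preferable to split at pauses or to overlap neighbouring segments slightly.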