FireRedTeam
Open-source ASR models for Mandarin, dialects, and English
Top 24.6% on SourcePulse
FireRedASR provides open-source, industrial-grade Automatic Speech Recognition (ASR) models for Mandarin, Chinese dialects, and English. It offers two variants: FireRedASR-LLM for state-of-the-art performance leveraging LLMs, and FireRedASR-AED for a balance of performance and efficiency. The project targets researchers and developers needing high-accuracy speech-to-text capabilities, particularly for Mandarin, with demonstrated SOTA results on public benchmarks.
How It Works
FireRedASR features two architectures: FireRedASR-LLM uses an Encoder-Adapter-LLM framework to integrate large language models for enhanced speech interaction. FireRedASR-AED employs an Attention-based Encoder-Decoder (AED) architecture, optimized for efficiency and serving as a robust speech representation module. This dual approach allows users to select models based on performance or efficiency requirements.
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt inside a Python 3.10 Conda environment.
Requires ffmpeg.
Scripts (speech2text.py) are provided for inference. Python API usage is also demonstrated.
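The ffmpeg requirement suggests that audio is converted to a fixed format before inference. The following is a minimal sketch of such a conversion step, assuming the models expect 16 kHz mono 16-bit PCM WAV input (a common ASR input format that this summary does not state). The helper names build_ffmpeg_command and convert_to_wav are hypothetical and not part of FireRedASR; only standard ffmpeg options are used.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Return an ffmpeg argument list that converts src to mono 16-bit PCM WAV.

    The 16 kHz default is an assumption, not a value taken from the project.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file in any format ffmpeg can decode
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", "1",               # downmix to a single channel
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM samples
        "-f", "wav",              # force the WAV container
        dst,
    ]


def convert_to_wav(src: str, dst: str, sample_rate: int = 16000) -> Path:
    """Run ffmpeg and return the path of the converted file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return Path(dst)
```

Keeping the command construction separate from the subprocess call makes the arguments easy to inspect or log before any file is touched.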
Maintenance & Community
The project is actively developed, with recent releases in early 2025. Key contributors are listed as Kai-Tuo Xu, Feng-Long Xie, Xu Tang, and Yao Hu. Further community engagement channels are not explicitly mentioned in the README.
Licensing & Compatibility
The project's license is not specified. The README mentions dependencies on other open-source works such as Qwen2-7B-Instruct, WeNet, and Speech-Transformer, whose own licensing terms could affect commercial use or closed-source linking.
Limitations & Caveats
FireRedASR-AED supports audio inputs up to 60s; longer inputs may cause issues. FireRedASR-LLM supports up to 30s, with behavior for longer inputs unknown. Batch beam search with FireRedASR-LLM may require similar utterance lengths to avoid repetition.
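One way to work within these limits is to plan the inputs before inference. The sketch below is an illustration only, not the project's method: it splits a long recording into equal windows no longer than the stated limit (60 s for AED, 30 s for LLM) and groups utterances of similar duration into batches. The function names, the MAX_SECONDS table, and the max_ratio threshold of 1.5 are all hypothetical choices. Note that cutting at fixed times may split a word; a voice-activity detector would choose better boundaries.

```python
import math

# Limits taken from the text above; the dictionary keys are illustrative labels.
MAX_SECONDS = {"aed": 60.0, "llm": 30.0}


def split_into_windows(duration_s: float, max_s: float) -> list[tuple[float, float]]:
    """Return equal (start, end) windows covering the recording, each <= max_s."""
    if duration_s <= 0:
        return []
    n = math.ceil(duration_s / max_s)   # fewest windows that respect the limit
    step = duration_s / n               # equal-length windows
    return [(i * step, min((i + 1) * step, duration_s)) for i in range(n)]


def batch_by_length(durations: dict[str, float], batch_size: int,
                    max_ratio: float = 1.5) -> list[list[str]]:
    """Group utterance ids so each batch holds utterances of similar duration.

    A batch is closed when it is full, or when the next utterance is more than
    max_ratio times longer than the shortest utterance already in the batch.
    """
    ordered = sorted(durations.items(), key=lambda item: item[1])
    batches: list[list[str]] = []
    current: list[str] = []
    for uttid, dur in ordered:
        too_long = bool(current) and dur > durations[current[0]] * max_ratio
        if current and (len(current) >= batch_size or too_long):
            batches.append(current)
            current = []
        current.append(uttid)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(split_into_windows(150.0, MAX_SECONDS["aed"]))
    print(batch_by_length({"a": 2.0, "b": 2.5, "c": 10.0, "d": 11.0, "e": 29.0}, 4))
```

Sorting by duration first means each batch only has to compare a new utterance against the shortest one it already holds, which keeps padding and length mismatch within a batch small.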