PyTorch ASR framework for streaming and non-streaming speech recognition
Top 48.5% on SourcePulse
MASR (Magical Automatic Speech Recognition) is a PyTorch-based framework for streaming and non-streaming automatic speech recognition (ASR). It supports models such as Conformer and DeepSpeech2, multiple decoding methods, and extensive data augmentation, which makes it useful to researchers and developers working on speech recognition. The framework aims for simplicity and practicality, with deployment options for servers and NVIDIA Jetson devices.
How It Works
MASR is implemented in PyTorch, which keeps model architecture and training flexible. It supports several pre-processing methods (e.g., fbank, mfcc) and a range of data augmentation methods to improve model robustness. Its key advantage is unified support for streaming and non-streaming ASR, selected through a single configuration parameter, together with several decoding strategies: CTC greedy search, CTC prefix beam search, and attention rescoring.
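To make the simplest of these decoding strategies concrete, here is a minimal sketch of CTC greedy search in plain Python. It is not MASR's own code or API: the function name `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 as the blank token are all assumptions made for illustration. The idea is to take the highest-scoring class in each frame, merge consecutive repeats, and drop blanks.

```python
from typing import List, Optional, Sequence

BLANK_ID = 0  # assumption: index 0 is the CTC blank token


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Pick the best class per frame, merge consecutive repeats, drop blanks."""
    decoded: List[int] = []
    previous: Optional[int] = None
    for scores in frame_scores:
        # Index of the highest-scoring class in this frame (argmax).
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the class changes and is not the blank token.
        if best != previous and best != blank_id:
            decoded.append(best)
        previous = best
    return decoded


# Hypothetical 3-class output: 0 = blank, 1 = "a", 2 = "b".
vocab = {1: "a", 2: "b"}
frames = [
    [0.10, 0.80, 0.10],  # "a"
    [0.10, 0.70, 0.20],  # "a" again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two "a" tokens
    [0.20, 0.70, 0.10],  # "a", emitted because a blank came before it
    [0.10, 0.20, 0.70],  # "b"
]
ids = ctc_greedy_decode(frames)
text = "".join(vocab[i] for i in ids)
print(text)  # prints "aab"
```

Greedy search keeps a single hypothesis and is therefore the cheapest option; CTC prefix beam search tracks several candidate prefixes per frame, and attention rescoring re-ranks those candidates with an attention decoder, trading extra computation for accuracy.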
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
3 months ago
Inactive