End-to-end attention-based speech recognition
This project provides an end-to-end, attention-based large-vocabulary speech recognition system that serves as the reference implementation for its associated research papers. It targets speech recognition researchers and practitioners who need to understand or reproduce the results of that work. Its main value is a working, albeit dated, implementation of an attention-based ASR system.
How It Works
The system uses an attention mechanism for speech recognition: when generating each output token, the model focuses on the relevant parts of the input audio sequence. This approach, detailed in the referenced papers, is an alternative to traditional frame-synchronous models because it maps variable-length audio sequences directly to variable-length text sequences.
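The idea of focusing on relevant input frames can be illustrated with a minimal NumPy sketch of content-based (additive) attention. This is not the project's code, which is built on Theano and Blocks, and the mechanism in the referenced papers may differ in its details. The function names, weight matrices, and dimensions below are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(encoder_states, decoder_state, W_enc, W_dec, v):
    """Additive attention over encoded audio frames (illustrative only).

    encoder_states: (T, d_enc) one vector per encoded input frame
    decoder_state:  (d_dec,)   current state of the output-side network
    W_enc: (d_enc, d_att), W_dec: (d_dec, d_att), v: (d_att,)
    """
    # One scalar relevance score per input frame.
    scores = np.tanh(encoder_states @ W_enc + decoder_state @ W_dec) @ v
    # Normalize the scores into weights that sum to 1 over the T frames.
    weights = softmax(scores)
    # The context vector is the weighted average of the encoder states.
    context = weights @ encoder_states
    return weights, context

# Hypothetical sizes and random parameters, purely for demonstration.
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 50, 8, 6, 4
H = rng.normal(size=(T, d_enc))
s = rng.normal(size=d_dec)
W_enc = rng.normal(size=(d_enc, d_att))
W_dec = rng.normal(size=(d_dec, d_att))
v = rng.normal(size=d_att)

weights, context = attend(H, s, W_enc, W_dec, v)
print(weights.shape, context.shape)
```

In a full recognizer, a step like this would run once per output token, so the number of output tokens does not have to match the number of input frames.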
Quick Start & Requirements
Setup involves building with the --shared and --use-cuda=no options, installing the Python packages pykwalify, toposort, pyyaml, numpy, pandas, and pyfst, and installing kaldi-python via python setup.py install.
See exp/wsj for specific instructions on replicating results with the WSJ dataset.
Highlighted Details
Maintenance & Community
This codebase is no longer maintained and is built on outdated technologies (Theano, Blocks). Users are advised to look at more modern implementations.
Licensing & Compatibility
Limitations & Caveats
The project states explicitly that it is no longer maintained because it relies on outdated technologies such as Theano and Blocks, and it recommends that users seek more modern alternatives. This significantly limits its practical use for current ASR development.
2 years ago
Inactive