CTC beam search decoder for speech recognition
This library provides a fast, Python-based CTC beam search decoder for speech recognition, targeting researchers and developers working with models like Nvidia's Conformer-CTC or Facebook's Wav2Vec2. It offers advanced features such as BPE vocabulary support, hotword boosting, and real-time decoding, aiming to match C++ implementation performance.
How It Works
The decoder implements CTC beam search in Python, leveraging optimizations like caching and beam pruning to achieve performance competitive with C++ implementations. It supports n-gram language models (e.g., KenLM) and integrates features like byte pair encoding (BPE) vocabulary handling and stateful language model decoding for real-time applications. This Python-centric approach facilitates rapid prototyping and experimentation with new features.
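To make the idea of CTC beam search with beam pruning concrete, here is a minimal, self-contained sketch in pure Python. It is not pyctcdecode's implementation: it uses no language model, no caching, and no hotwords, and the names `ctc_beam_search`, `log_sum_exp`, and `beam_width` are chosen for illustration only. Each candidate prefix tracks two log-probabilities (paths ending in blank, paths ending in a non-blank label) so that different alignments collapsing to the same text are merged.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def log_sum_exp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, labels, beam_width=8, blank=0):
    """Illustrative CTC prefix beam search (no language model).

    log_probs: list of frames, each a list of per-label log-probabilities.
    labels:    list of label strings; labels[blank] is the CTC blank.
    Returns (decoded_text, log_probability_of_that_text).
    """
    # prefix (tuple of label indices) -> (log p ending in blank, log p ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for idx, lp in enumerate(frame):
                if idx == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (log_sum_exp(b, p_b + lp, p_nb + lp), nb)
                elif prefix and prefix[-1] == idx:
                    # Repeated label without a blank in between collapses into the same prefix.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (b, log_sum_exp(nb, p_nb + lp))
                    # Repeated label after a blank starts a genuinely new character.
                    ext = prefix + (idx,)
                    b, nb = next_beams[ext]
                    next_beams[ext] = (b, log_sum_exp(nb, p_b + lp))
                else:
                    # A different label always extends the prefix.
                    ext = prefix + (idx,)
                    b, nb = next_beams[ext]
                    next_beams[ext] = (b, log_sum_exp(nb, p_b + lp, p_nb + lp))
        # Beam pruning: keep only the beam_width most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: log_sum_exp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])
    best_prefix, (p_b, p_nb) = max(beams.items(), key=lambda kv: log_sum_exp(*kv[1]))
    return "".join(labels[i] for i in best_prefix), log_sum_exp(p_b, p_nb)


# Two frames, vocabulary {blank, "a"}: greedy decoding picks blank twice (empty string),
# but the three alignments that collapse to "a" sum to 0.24 + 0.24 + 0.16 = 0.64.
frames = [[math.log(0.6), math.log(0.4)]] * 2
text, score = ctc_beam_search(frames, ["", "a"], beam_width=4)
print(text, round(math.exp(score), 2))  # a 0.64
```

A decoder with language model support would typically add a weighted language model score when ranking prefixes; that step is left out of this sketch.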
Quick Start & Requirements
pip install pyctcdecode
N-gram language model files (e.g., KenLM) can be supplied in .arpa or .bin format.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README notes that default hyperparameter values are tuned for a specific use case, recommending users perform their own optimization for best results, especially with non-English languages.
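One common way to carry out such an optimization is a small grid search over decoder hyperparameters, scored by word error rate (WER) on a held-out development set. The sketch below is generic and makes several assumptions: `decode_fn` is a hypothetical callable wrapping whatever decoder is in use, and the parameter names `lm_weight` and `word_bonus` in the usage example are placeholders, not names taken from the library.

```python
from itertools import product


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(ref), 1)


def grid_search(decode_fn, dev_set, grid):
    """Try every combination in `grid` and return (best_params, best_mean_wer).

    decode_fn: hypothetical callable, decode_fn(sample, **params) -> str.
    dev_set:   list of (sample, reference_text) pairs.
    grid:      dict mapping parameter name -> list of candidate values.
    """
    names = sorted(grid)
    best_params, best_wer = None, float("inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        wer = sum(word_error_rate(ref, decode_fn(x, **params))
                  for x, ref in dev_set) / len(dev_set)
        if wer < best_wer:
            best_params, best_wer = params, wer
    return best_params, best_wer


# Usage with a stand-in decoder (placeholder for a real decoding call).
def fake_decode(sample, lm_weight, word_bonus):
    return sample if lm_weight == 0.5 else "wrong"


best, wer = grid_search(
    fake_decode,
    [("hello world", "hello world")],
    {"lm_weight": [0.1, 0.5], "word_bonus": [0.0, 1.0]},
)
print(best, wer)
```

For a language other than the one the defaults were tuned on, the development set should come from the target language and domain, since the best values can shift considerably.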