Python package for Kaldi speech recognition with dynamic grammars
This project provides a Python package for dynamic, context-aware speech recognition using the Kaldi engine, enabling granular control over active grammars during decoding. It targets developers building command-and-control applications, offering improved accuracy by reducing the search space and allowing shared dictation models.
How It Works
Kaldi Active Grammar (KaldiAG) extends Kaldi's decoding graph capabilities by allowing multiple grammars to be compiled separately and dynamically activated or deactivated on a per-utterance basis. Unlike a traditional monolithic Kaldi graph, the decoder considers only the grammars relevant to the current context, which makes recognition more efficient and accurate. KaldiAG integrates with the Dragonfly speech recognition framework, which is used to define grammars and their associated actions.
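The idea of per-utterance activation can be illustrated with a small, self-contained toy model. This is only a conceptual sketch: ToyGrammar and ToyDecoder are hypothetical names invented for illustration, plain phrase sets stand in for compiled decoding graphs, and none of this is KaldiAG's or Dragonfly's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGrammar:
    """Hypothetical stand-in for one separately compiled grammar."""
    name: str
    phrases: frozenset


@dataclass
class ToyDecoder:
    """Hypothetical decoder: holds all loaded grammars, searches only active ones."""
    grammars: dict = field(default_factory=dict)

    def load(self, grammar):
        # Store each grammar once so it can be reused across utterances.
        self.grammars[grammar.name] = grammar

    def search_space(self, active):
        # Union of the phrases of the grammars active for this utterance.
        return set().union(*(self.grammars[n].phrases for n in active))

    def decode(self, utterance, active):
        # Return (grammar name, phrase) for the first active grammar that
        # matches; inactive grammars are never consulted.
        for name in active:
            if utterance in self.grammars[name].phrases:
                return name, utterance
        return None


decoder = ToyDecoder()
decoder.load(ToyGrammar("editor", frozenset({"save file", "close tab"})))
decoder.load(ToyGrammar("browser", frozenset({"close tab", "go back"})))
decoder.load(ToyGrammar("media", frozenset({"play", "pause"})))

# The active set is chosen per utterance, for example from the focused application.
result_editor = decoder.decode("close tab", active=["editor"])
result_browser = decoder.decode("go back", active=["browser"])
# "play" belongs to a loaded but inactive grammar, so it is not recognized.
result_inactive = decoder.decode("play", active=["editor", "browser"])
```

In this toy, activating only the editor grammar leaves two candidate phrases instead of the five distinct phrases across all loaded grammars, which is the search-space reduction the description refers to.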
Quick Start & Requirements
Install with pip install 'dragonfly2[kaldi]' (recommended for Dragonfly integration) or pip install kaldi-active-grammar (for direct use).
Optional extras: pip install 'kaldi-active-grammar[g2p_en]' or pip install 'kaldi-active-grammar[online]'.
Windows users may need to install the VC2017+ redistributable.
kaldi-dragonfly-winpython is available for quick setup.
Highlighted Details
Maintenance & Community
The project is developed by David Zurow (@daanzu). Donations to support development are appreciated. The README lists related repositories and a Docker image for Linux.
Licensing & Compatibility
Licensed under the GNU Affero General Public License v3 (AGPL-3.0-or-later). This is a strong copyleft license that may have implications for commercial or closed-source use. The project incorporates code from Kaldi ASR (Apache-2.0) and OpenFST (Apache-2.0).
Limitations & Caveats
Formal documentation is currently limited; example usage is found mainly in the README. The project relies on a specific fork of Kaldi that is not intended for standalone use. Conversion of standard Kaldi models is not yet fully implemented.