Speech recognition toolkit bridging PyTorch and Kaldi
Top 19.2% on SourcePulse
This project provides a toolkit for developing state-of-the-art hybrid DNN/HMM speech recognition systems, integrating PyTorch for deep learning components and Kaldi for feature extraction, label computation, and decoding. It's designed for researchers and engineers working on Automatic Speech Recognition (ASR) to build flexible and efficient systems.
How It Works
The toolkit bridges PyTorch and Kaldi, leveraging PyTorch's flexibility for neural network development and Kaldi's efficiency in traditional speech processing tasks. It allows for easy integration of custom acoustic models and offers several pre-implemented neural network architectures (MLP, CNN, RNN, LSTM, GRU, Li-GRU, SincNet). The system is configured via INI files, enabling complex architectures that combine multiple features and label streams, and supports multi-GPU training and distributed computing.
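To make the INI-driven configuration more concrete, here is a minimal sketch of how such a file could be read with Python's standard `configparser` module. The section names (`exp`, `architecture1`) and keys (`out_folder`, `use_cuda`, `n_epochs`, `arch_name`, `arch_class`, `dnn_lay`, `dnn_drop`) are hypothetical placeholders chosen for illustration; they should not be taken as the toolkit's actual configuration schema, and the helper functions are not part of the project.

```python
import configparser

# Hypothetical INI text: section and key names are illustrative only,
# not the toolkit's actual configuration schema.
EXAMPLE_CFG = """
[exp]
out_folder = exp/demo
use_cuda = True
n_epochs = 24

[architecture1]
arch_name = mlp_demo
arch_class = MLP
dnn_lay = 1024,1024,1024
dnn_drop = 0.15,0.15,0.0
"""


def load_config(text):
    """Parse INI-formatted text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def parse_list(value, cast):
    """Split a comma-separated INI value and convert each item with `cast`."""
    return [cast(item.strip()) for item in value.split(",")]


config = load_config(EXAMPLE_CFG)

# Per-layer hyperparameters are stored as comma-separated strings.
layers = parse_list(config["architecture1"]["dnn_lay"], int)
dropout = parse_list(config["architecture1"]["dnn_drop"], float)

# Scalar options can use configparser's typed getters.
use_cuda = config["exp"].getboolean("use_cuda")
n_epochs = config["exp"].getint("n_epochs")

print(layers, dropout, use_cuda, n_epochs)
```

Keeping per-layer settings as comma-separated values in one section is one plausible way to describe a whole network in a flat INI file; a setup combining several feature or label streams could add one section per stream in the same style.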
Quick Start & Requirements
pip install -r requirements.txt
Highlighted Details
Maintenance & Community
The project encourages community contributions and feedback for future development, aiming to support a wider range of speech processing tasks. The README mentions a successor project, SpeechBrain, which is recommended for new development.
Licensing & Compatibility
Released under a Creative Commons Attribution 4.0 International license, which permits copying, distribution, and modification for research, commercial, and non-commercial purposes, provided the original paper is cited.
Limitations & Caveats
The repository is marked inactive, and the README recommends its successor, SpeechBrain, for new work. The README mentions plans to extend the toolkit to more speech-related tasks. Compatibility with very old PyTorch versions is not guaranteed.
3 years ago
Inactive