Speech-related recipes for various datasets using k2-fsa and lhotse
Icefall provides recipes for training and deploying Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models. It targets researchers and engineers working with speech technologies. Its recipes cover many datasets and current model architectures, which supports rapid prototyping and benchmarking.
How It Works
Icefall builds on two libraries: k2-fsa, which supplies finite-state-automaton operations for loss computation and decoding, and lhotse, which handles speech data preparation. It supports architectures such as TDNN, LSTM, Conformer, and Zipformer, trained with CTC or Transducer loss functions. Trained models can be deployed through Sherpa, Sherpa-NCNN, and Sherpa-ONNX, which makes it easier to integrate them into applications.
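To make the CTC output convention mentioned above concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is an illustration only, not Icefall's decoding code or API: the function name `ctc_greedy_decode`, the per-frame score layout, and the choice of index 0 for the blank symbol are all assumptions made for this example.

```python
from typing import List, Optional, Sequence

BLANK_ID = 0  # assumption for this sketch: the CTC blank symbol has index 0


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Greedy CTC decoding: pick the best token per frame, merge
    consecutive repeats, then drop blanks.

    frame_scores is a hypothetical (num_frames x vocab_size) list of
    per-frame scores, e.g. probabilities produced by an acoustic model.
    """
    # Step 1: best-scoring token index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # Step 2: collapse runs of the same token and remove blanks.
    decoded: List[int] = []
    previous: Optional[int] = None
    for token in best_path:
        if token != previous and token != blank_id:
            decoded.append(token)
        previous = token
    return decoded


if __name__ == "__main__":
    # Five frames over a three-symbol vocabulary (0 = blank).
    scores = [
        [0.1, 0.8, 0.1],    # token 1
        [0.1, 0.8, 0.1],    # token 1 again (merged with the previous frame)
        [0.9, 0.05, 0.05],  # blank separates the two occurrences of token 1
        [0.1, 0.8, 0.1],    # token 1
        [0.1, 0.1, 0.8],    # token 2
    ]
    print(ctc_greedy_decode(scores))
```

The blank frame between the repeated tokens is what lets CTC emit the same symbol twice in a row. A Transducer model follows a different emission scheme, and practical decoders usually add beam search or lattice-based decoding on top of this basic rule.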
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats