Open-Speech-EkStep
Open-source speech models for Indic languages
This repository provides a suite of open-source speech processing models, focused primarily on Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) for Indic languages. It is aimed at researchers and developers working with low-resource languages, and offers pretrained and fine-tuned models, language models, and ancillary tools such as punctuation and gender classification models.
How It Works
The project builds on the Conformer and wav2vec2 architectures, trained on large speech datasets. Two pretrained models serve as foundations: Vakyansh-Conformer-SSL (34,000 hours across 39 Indian languages) and CLSRIL-23 (10,000 hours across 23 Indic languages). These are then fine-tuned for individual languages, as in hindi_large_ssl_2500 and kannada_large_ssl_1000. The repository also provides 5-gram language models built with KenLM to improve ASR accuracy. The TTS models combine Glow-TTS with a HiFi-GAN vocoder.
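To make the role of the language model concrete, here is a minimal sketch of how an n-gram language model can rerank ASR hypotheses. It is an illustration under stated assumptions, not the repository's decoding code: a toy add-one-smoothed bigram model stands in for the KenLM 5-gram, and the function names (`train_bigram_lm`, `lm_logprob`, `rescore`), the weights `alpha` and `beta`, and the acoustic scores are all hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count contexts and bigrams from a toy corpus.

    Assumption: this bigram model is only a stand-in for a KenLM 5-gram.
    """
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams, len(vocab)


def lm_logprob(sentence, lm):
    """Natural-log probability of a sentence with add-one smoothing."""
    contexts, bigrams, vocab_size = lm
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def rescore(hypotheses, lm, alpha=0.5, beta=0.0):
    """Pick the hypothesis with the best combined score.

    hypotheses: list of (text, acoustic_logprob) pairs (hypothetical values).
    alpha weights the language model; beta is a per-word bonus.
    """
    def combined(item):
        text, acoustic = item
        return acoustic + alpha * lm_logprob(text, lm) + beta * len(text.split())

    return max(hypotheses, key=combined)[0]


lm = train_bigram_lm(["the cat sat", "the cat ran", "a dog sat"])
hypotheses = [("the cat sat", -4.0), ("the cat sad", -3.9)]
print(rescore(hypotheses, lm))             # language model included
print(rescore(hypotheses, lm, alpha=0.0))  # acoustic score only
```

With the language model switched off (`alpha=0.0`), the hypothesis with the slightly better acoustic score wins even though it contains an unlikely word. With the language model included, the hypothesis that matches the training text is preferred. In practice the weights are tuned on held-out data.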
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
3 years ago
Inactive