Speech model for diverse dialects
This repository provides the TeleSpeech-ASR large model, an automatic speech recognition system capable of understanding over 30 Chinese dialects. It is designed for researchers and developers working with diverse Chinese dialects, offering pre-trained models and fine-tuning frameworks to achieve high accuracy with limited labeled data.
How It Works
The project uses self-supervised pre-training on 300,000 hours of unlabeled multi-dialectal speech, followed by fine-tuning on labeled data covering 30 dialects. Its core advantage is overcoming the single-dialect limitation of earlier models: one unified model comprehends a wide range of dialects. Users can either fine-tune the pre-trained models with frameworks such as Fairseq, or use them as feature extractors with Wenet for downstream ASR tasks.
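As context for the Fairseq fine-tuning route: Fairseq-style speech recipes typically consume .tsv manifests listing an audio root on the first line, then one relative-path/sample-count pair per line. The sketch below is a hypothetical generator under that assumption (the exact manifest layout this repo expects is not confirmed here):

```python
import os
import wave

def build_manifest(audio_root: str, out_path: str) -> int:
    """Write a wav2vec-style .tsv manifest: first line is the audio
    root directory, each following line is '<relative path>\t<samples>'.
    Returns the number of .wav files indexed."""
    count = 0
    with open(out_path, "w") as out:
        out.write(audio_root + "\n")
        for dirpath, _, filenames in os.walk(audio_root):
            for name in sorted(filenames):
                if not name.endswith(".wav"):
                    continue
                full = os.path.join(dirpath, name)
                # read the frame count without loading the audio payload
                with wave.open(full, "rb") as wav:
                    n_samples = wav.getnframes()
                rel = os.path.relpath(full, audio_root)
                out.write(f"{rel}\t{n_samples}\n")
                count += 1
    return count
```

The sample count per file lets the training loader sort and batch utterances by length without reopening the audio.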
Quick Start & Requirements
Installation requires Fairseq: run pip install --editable ./ inside the cloned Fairseq directory, then install the remaining dependencies (pip install -r requirements.txt, or individual packages such as kaldiio, timm, editdistance, and soundfile). Kaldi is required for feature extraction unless kaldi_io.py is used. Manifest files (.tsv) are required for training and inference.
Highlighted Details
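The dialect benchmarks below report error-rate percentages. As a generic aside (this is not the project's scoring code), character error rate is typically computed as the edit distance between hypothesis and reference, divided by the reference length; a self-contained sketch:

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Classic dynamic-programming edit distance over characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edits needed / reference length."""
    return levenshtein(ref, hyp) / max(len(ref), 1)
```

The editdistance package from the dependency list provides a faster implementation of the same edit-distance computation.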
Reported results with pretrain_large include WenetSpeech (13.0%), Babel (19.1%), and KeSpeech (8.1%).
Maintenance & Community
Licensing & Compatibility
A commercial license must be requested by email (tele_ai@chinatelecom.cn); the grant is non-exclusive, worldwide, non-transferable, non-sublicensable, and revocable.
Limitations & Caveats
The project statement strongly advises against using the TeleSpeech models for any activity that harms national or social security or is illegal, and requires a security review and filing for internet-facing services. Despite efforts to ensure data compliance, the authors disclaim responsibility for issues arising from data security, public-opinion risks, or misuse of the model. The unsupervised pre-trained models (pretrain_base, pretrain_large) require supervised fine-tuning before they can be used for inference.
Last commit: 1 year ago; the project appears inactive.