ASR_Theory by zw76859420

Speech recognition theory and practice

Created 7 years ago

619 stars

Top 52.5% on SourcePulse

Project Summary

This repository provides a summary of Automatic Speech Recognition (ASR) theory, including papers, presentations, and personal insights. It is intended for researchers and practitioners interested in the theoretical underpinnings of ASR systems. The project aims to consolidate learning and share valuable resources for the ASR community.

How It Works

The repository groups its resources into a theory section and a practice section. It highlights key papers, adds the author's commentary, and includes slide decks (PPTs) that walk through building GMM-HMM and NN-HMM acoustic models with the Kaldi toolkit. It also links to the author's own GitHub projects that implement syllable-, word-, and phone-based ASR models.
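For readers unfamiliar with the GMM-HMM acoustic models mentioned above, the following is a minimal sketch of the quantity such a model computes: the likelihood of a sequence of feature frames, obtained with the forward algorithm over HMM states whose emissions are Gaussian mixtures. It is not taken from the repository or from Kaldi. The one-dimensional features, the two-state left-to-right topology, and every parameter value are assumptions chosen only for illustration.

```python
import math

def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gmm_log_likelihood(x, gmm):
    """Log p(x | state) for a mixture given as [(weight, mean, var), ...]."""
    return logsumexp([safe_log(w) + log_gaussian(x, mu, var) for w, mu, var in gmm])

def forward_log_likelihood(frames, log_init, log_trans, state_gmms):
    """Forward algorithm: log p(frames | HMM) with GMM emissions."""
    n = len(state_gmms)
    # Initialization: start probability times emission of the first frame.
    alpha = [log_init[s] + gmm_log_likelihood(frames[0], state_gmms[s]) for s in range(n)]
    # Recursion: sum over predecessor states, then apply the emission.
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[p] + log_trans[p][s] for p in range(n)])
            + gmm_log_likelihood(x, state_gmms[s])
            for s in range(n)
        ]
    # Termination: sum over all states at the last frame.
    return logsumexp(alpha)

# Hypothetical toy model: two states, left-to-right, 1-D features.
log_init = [safe_log(1.0), safe_log(0.0)]
log_trans = [[safe_log(0.6), safe_log(0.4)],
             [safe_log(0.0), safe_log(1.0)]]
state_gmms = [
    [(0.5, -0.5, 1.0), (0.5, 0.5, 1.0)],  # state 0: two components near 0
    [(1.0, 5.0, 1.0)],                    # state 1: one component near 5
]

# Frames that follow the state order (near 0, then near 5) versus the reverse.
score_match = forward_log_likelihood([0.1, -0.2, 4.8, 5.1], log_init, log_trans, state_gmms)
score_mismatch = forward_log_likelihood([5.0, 5.1, 0.0, 0.2], log_init, log_trans, state_gmms)
print(score_match > score_mismatch)
```

In an NN-HMM system the HMM part of this computation stays the same and a neural network supplies the per-state emission scores in place of `gmm_log_likelihood`. Real systems use multi-dimensional features such as MFCCs and train the parameters from data rather than setting them by hand.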

Quick Start & Requirements

The repository provides no installation steps or quick-start commands. Readers should already be familiar with core ASR concepts. Following the practical material on acoustic model construction also requires a working knowledge of the Kaldi toolkit.

Highlighted Details

Includes the slide deck (PPT) from a Google presentation at INTERSPEECH, which the author singles out for its quality.
Summarizes recent deep learning network architectures and links to the author's related GitHub projects (ASR_Syllable, ASR_WORD, ASR_Phone).
Focuses on acoustic modeling units such as syllables, words, and phones.
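
To make the difference between these modeling units concrete, here is a small sketch that expands the same word sequence into word, syllable, and phone targets. The Mandarin lexicon, the pinyin syllables with tone digits, and the initial/final phone split are all hypothetical and are not taken from the repository or its linked projects. The function name `to_units` is likewise an illustrative choice.

```python
# Hypothetical toy lexicon: word -> pinyin syllables with tone digits.
WORD_TO_SYLLABLES = {
    "你好": ["ni3", "hao3"],
    "北京": ["bei3", "jing1"],
}

# Hypothetical syllable -> phone split (initial plus tonal final).
SYLLABLE_TO_PHONES = {
    "ni3": ["n", "i3"],
    "hao3": ["h", "ao3"],
    "bei3": ["b", "ei3"],
    "jing1": ["j", "ing1"],
}

def to_units(words, unit):
    """Expand a word sequence into the target sequence for one modeling unit."""
    if unit == "word":
        return list(words)
    syllables = [s for w in words for s in WORD_TO_SYLLABLES[w]]
    if unit == "syllable":
        return syllables
    if unit == "phone":
        return [p for s in syllables for p in SYLLABLE_TO_PHONES[s]]
    raise ValueError(f"unknown modeling unit: {unit}")

utterance = ["你好", "北京"]
for unit in ("word", "syllable", "phone"):
    targets = to_units(utterance, unit)
    print(unit, len(targets), targets)
```

Smaller units give longer target sequences drawn from a smaller inventory, while larger units give shorter sequences but require the model to cover a much larger vocabulary.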

Maintenance & Community

The repository states it will no longer be actively maintained. Users are directed to the "元语音" (Meta-Speech) website and WeChat group for ongoing research and discussion.

Licensing & Compatibility

The license is not specified in the provided text.

Limitations & Caveats

Because the repository is no longer actively maintained, its content may lag behind current ASR research. It contains reference material only: the implementation code lives in the author's separate GitHub projects (ASR_Syllable, ASR_WORD, ASR_Phone).
