awesome-speech-recognition-speech-synthesis-papers by zzw922cn

Curated list of speech and audio AI research papers

Created 9 years ago

3,127 stars

Top 14.8% on SourcePulse

1 Expert Loves This Project

David Cournapeau

Author of scikit-learn

Project Summary

This repository is a curated list of academic papers on speech and audio processing: Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis (TTS), Voice Conversion (VC), Language Modeling, Singing Voice Synthesis (SVS), and Music Modeling. It is a reference for researchers, engineers, and students who want to survey foundational and state-of-the-art techniques in the field.

How It Works

The repository functions as a comprehensive bibliography, categorizing influential and recent research papers within specific sub-domains of speech and audio processing. Each entry typically includes the paper title, authors, year, and a direct link to the PDF. This structured approach allows users to quickly navigate and discover relevant literature.
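
As a rough illustration of the entry structure described above, the following minimal sketch models one bibliography entry and parses it from a single list line. The line layout `- [Title](url) by Authors (Year)`, the `PaperEntry` class, and the `parse_entry` helper are all assumptions made for this example; the README's actual formatting may differ, in which case the pattern would need adjusting.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical entry layout: "- [Title](url) by Authors (Year)".
# This is an assumed format for illustration, not the README's confirmed one.
ENTRY_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)"
    r"(?:\s+by\s+(?P<authors>.+?))?"
    r"(?:\s*\((?P<year>\d{4})\))?\s*$"
)


@dataclass
class PaperEntry:
    """One bibliography entry: title and link, plus optional authors and year."""
    title: str
    url: str
    authors: Optional[str] = None
    year: Optional[int] = None


def parse_entry(line: str) -> Optional[PaperEntry]:
    """Return a PaperEntry for a matching list line, or None for any other line."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    year = m.group("year")
    return PaperEntry(
        title=m.group("title").strip(),
        url=m.group("url"),
        authors=m.group("authors"),
        year=int(year) if year else None,
    )


if __name__ == "__main__":
    sample = "- [Example Paper Title](https://example.org/paper.pdf) by A. Author and B. Author (2020)"
    print(parse_entry(sample))
```

Lines that do not match the assumed layout, such as section headings, return `None`, so the helper can be mapped over every line of a file and the non-entries filtered out.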

Quick Start & Requirements

This is a curated list of papers; there are no installation or execution requirements. Users can directly access the papers via the provided PDF links.

Highlighted Details

Extensive coverage of seminal works in ASR, dating back to the early 1980s, including foundational papers on Hidden Markov Models (HMMs) and early neural network approaches.
Detailed listings of advancements in end-to-end ASR models, such as Connectionist Temporal Classification (CTC), attention-based models, and Transformer-based architectures like Conformer.
Comprehensive collection of papers on Speech Synthesis, spanning from traditional parametric methods to modern neural vocoders (WaveNet, MelGAN, HiFi-GAN) and diffusion-based models.
Significant focus on Voice Conversion techniques, including non-parallel methods, zero-shot learning, and style transfer.
Inclusion of papers on emerging areas like Text-to-Audio and Music Generation.

Maintenance & Community

The repository is maintained by zzw922cn. Further community engagement or roadmap details are not specified in the README.

Licensing & Compatibility

The repository itself is only a list of links to academic papers. Each paper's license is set by its publisher or authors, so whether a given paper can be used commercially depends on that paper's license.

Limitations & Caveats

This is a static list of papers and does not provide code, datasets, or implementations. The "interesting papers" section is subjective and may not be exhaustive.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days