awesome-speech-recognition-speech-synthesis-papers  by zzw922cn

Curated list of speech and audio AI research papers

created 8 years ago
3,056 stars

Top 16.0% on sourcepulse

Project Summary

This repository is a curated list of academic papers on speech and audio processing, covering Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis (TTS), Voice Conversion (VC), Language Modeling, Singing Voice Synthesis (SVS), and Music Modeling. It is aimed at researchers, engineers, and students who want to keep up with both foundational and state-of-the-art techniques.

How It Works

The repository functions as a bibliography that groups influential and recent research papers by sub-domain of speech and audio processing. Each entry typically includes the paper title, authors, year, and a direct link to the PDF. This structure lets users navigate quickly to the literature relevant to a given topic.
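The entry structure described above can be pictured as a small data model. The following is only a sketch: the `PaperEntry` class, its field names, the `group_by_category` helper, and the placeholder entries are all assumptions made for illustration, and none of them come from the repository, which is a plain list rather than code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PaperEntry:
    """One bibliography entry; the field names are illustrative assumptions."""
    title: str
    authors: tuple[str, ...]
    year: int
    pdf_url: str
    category: str


def group_by_category(entries):
    """Group entries by category, sorting each group from oldest to newest."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.category].append(entry)
    return {
        category: sorted(papers, key=lambda p: p.year)
        for category, papers in grouped.items()
    }


# Placeholder entries only; these are not real papers from the list.
entries = [
    PaperEntry("Placeholder ASR paper B", ("Author Two",), 2020,
               "https://example.com/b.pdf", "ASR"),
    PaperEntry("Placeholder TTS paper", ("Author Three",), 2018,
               "https://example.com/c.pdf", "TTS"),
    PaperEntry("Placeholder ASR paper A", ("Author One",), 2015,
               "https://example.com/a.pdf", "ASR"),
]

index = group_by_category(entries)
for category, papers in index.items():
    print(category, [p.title for p in papers])
```

Grouping by category and ordering by year mirrors how a reader would typically scan such a list: pick a sub-domain, then follow the work chronologically.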

Quick Start & Requirements

This is a curated list of papers; there are no installation or execution requirements. Users can directly access the papers via the provided PDF links.

Highlighted Details

  • Extensive coverage of seminal works in ASR, dating back to the early 1980s, including foundational papers on Hidden Markov Models (HMMs) and early neural network approaches.
  • Detailed listings of advancements in end-to-end ASR models, such as Connectionist Temporal Classification (CTC), attention-based models, and Transformer-based architectures like Conformer.
  • Comprehensive collection of papers on Speech Synthesis, spanning from traditional parametric methods to modern neural vocoders (WaveNet, MelGAN, HiFi-GAN) and diffusion-based models.
  • Significant focus on Voice Conversion techniques, including non-parallel methods, zero-shot learning, and style transfer.
  • Inclusion of papers on emerging areas like Text-to-Audio and Music Generation.

Maintenance & Community

The repository is maintained by zzw922cn. Further community engagement or roadmap details are not specified in the README.

Licensing & Compatibility

The repository itself is a list of links to academic papers. The licensing of the individual papers would be governed by their respective publishers or authors. Compatibility for commercial use depends on the licenses of the linked papers.

Limitations & Caveats

This is a static list of papers and does not provide code, datasets, or implementations. The "interesting papers" section is subjective and may not be exhaustive.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 30 stars in the last 90 days
