awesome-speech-enhancement by WenzheLiu-Speech

Collection of resources for speech enhancement, separation, and sound source localization

created 5 years ago
1,159 stars

Top 34.1% on sourcepulse

Project Summary

This repository is a curated list of papers, code, and tools for speech enhancement, speech separation, and sound source localization. It gives researchers and practitioners in audio signal processing a structured overview of current techniques and links to their implementations.

How It Works

The repository categorizes resources by technique and application area, including spectral masking, complex domain processing, time-domain methods, generative models (GANs, VAEs, Diffusion), and hybrid approaches. It also covers dereverberation, single-channel separation, array signal processing, and relevant tools and books. This structured approach allows users to quickly find relevant research and code for specific speech processing tasks.

Highlighted Details

  • Extensive coverage of deep learning architectures like RNNs, CNNs, Transformers, and U-Nets for various speech enhancement and separation tasks.
  • Links to implementations of both traditional signal processing methods and modern generative models.
  • Features links to code repositories, papers, and datasets for numerous research projects.
  • Covers challenges such as the Deep Noise Suppression (DNS) Challenge and links to resources for data collection and evaluation.

Maintenance & Community

This is a community-driven "awesome" list curated by WenzheLiu-Speech, with contributions welcomed via pull requests. The health data on this page shows the last commit about a year ago and no pull requests or issues in the past 30 days, so active maintenance should not be assumed.

Licensing & Compatibility

The repository itself is a curated list and does not have a specific license. The linked code repositories carry their own licenses, which users must check before use, especially for commercial applications.

Limitations & Caveats

As a curated list, the repository does not provide direct implementations but rather links to external projects. The quality, maintenance status, and licensing of these linked projects vary, requiring individual assessment by the user.

Health Check

Last commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

47 stars in the last 90 days

Explore Similar Projects

Starred by Omar Sanseviero (DevRel at Google DeepMind) and Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers).

audio-ai-timeline by archinetai

0%
2k
AI model timeline for audio generation
created 2 years ago
updated 1 year ago
Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

awesome-diarization by wq2012

0.2%
2k
List of resources for speaker diarization
created 6 years ago
updated 1 week ago