Speech-Separation-Paper-Tutorial by JusperLee

Speech separation paper tutorial

Created 6 years ago

909 stars

Top 39.8% on SourcePulse

Project Summary

This repository serves as a comprehensive tutorial and resource hub for neural network-based speech separation, targeting researchers and engineers in audio processing. It provides an organized overview of papers, models, datasets, and evaluation metrics from 2016 to 2025, enabling users to quickly grasp the field's evolution and identify state-of-the-art approaches.

How It Works

The project curates and categorizes a vast collection of speech separation research, highlighting key trends such as the dominance of deterministic models (87%) and the prevalence of known-speaker scenarios (84%). It details various network architectures (Dual-path, Conv-TasNet, U-Net), learning methods (predictive, clustering, unsupervised), and separation strategies (mask vs. mapping), offering a structured understanding of the technical landscape.
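The mask vs. mapping distinction can be illustrated with a toy sketch. It is not taken from the repository or from any listed paper. It assumes magnitude spectra are plain lists of floats, that mixture magnitudes are roughly the sum of source magnitudes, and it uses an oracle ideal ratio mask in place of a trained network. All function names are hypothetical.

```python
def ideal_ratio_mask(source_mag, other_mag, eps=1e-8):
    """Oracle mask: the share of each frequency bin that belongs to the source."""
    return [s / (s + o + eps) for s, o in zip(source_mag, other_mag)]


def separate_with_mask(mixture_mag, mask):
    """Mask strategy: the model predicts per-bin weights applied to the mixture."""
    clipped = [min(max(w, 0.0), 1.0) for w in mask]  # keep weights in [0, 1]
    return [m * w for m, w in zip(mixture_mag, clipped)]


def separate_with_mapping(predicted_mag):
    """Mapping strategy: the model output is already the source estimate."""
    return [max(p, 0.0) for p in predicted_mag]  # magnitudes cannot be negative


# Toy magnitudes for two sources over three frequency bins (illustrative values).
source_a = [1.0, 0.0, 2.0]
source_b = [0.0, 3.0, 2.0]
mixture = [a + b for a, b in zip(source_a, source_b)]

# Mask route: estimate source A by weighting the mixture bin by bin.
masked_estimate = separate_with_mask(mixture, ideal_ratio_mask(source_a, source_b))

# Mapping route: a stand-in "model output" that predicts source A directly.
mapped_estimate = separate_with_mapping([1.0, -0.2, 2.0])
```

In the mask route the estimate can never exceed the mixture in any bin, while the mapping route has no such constraint. This is one reason the two strategies are treated as separate categories.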

Quick Start & Requirements

This repository is a curated collection of papers and resources, not a runnable codebase. To utilize specific models, users will need to refer to the linked papers and their respective code repositories.

Highlighted Details

Comprehensive timeline of speech separation models from 2016 to 2025.
Performance comparisons across popular datasets such as WSJ0-2Mix, WHAM!, and LibriMix, reporting SI-SNRi (scale-invariant signal-to-noise ratio improvement), SDRi (signal-to-distortion ratio improvement), and parameter counts.
Detailed breakdown of model categories: deterministic vs. generative, mask vs. mapping, and learning methods.
Extensive dataset descriptions (WSJ0-2Mix, WHAM!, LibriMix, WHAMR!, LRS2-2Mix, SonicSet) with generation methods and requirements.
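For readers unfamiliar with the SI-SNRi metric mentioned above, here is a minimal sketch of how it is commonly computed. It uses only the Python standard library and synthetic sine waves as stand-ins for speech. The function names and test signals are assumptions for illustration, not code from the repository. Practical evaluations typically use NumPy or PyTorch on real waveforms.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals."""
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to get the scaled reference.
    dot = sum(e * t for e, t in zip(est, tgt))
    scale = dot / (sum(t * t for t in tgt) + eps)
    s_target = [scale * t for t in tgt]
    # Whatever is left after the projection counts as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(n * n for n in e_noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: gain of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)


# Synthetic example: a target tone, an interfering tone, and their mixture.
target = [math.sin(0.10 * n) for n in range(200)]
interferer = [math.sin(0.37 * n + 1.0) for n in range(200)]
mixture = [t + i for t, i in zip(target, interferer)]
# A hypothetical separator output that leaves 10% of the interferer behind.
estimate = [t + 0.1 * i for t, i in zip(target, interferer)]

print(f"SI-SNRi: {si_snr_improvement(estimate, mixture, target):.2f} dB")
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score unchanged. This is the property the "scale-invariant" in the name refers to.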

Maintenance & Community

The project is maintained by JusperLee and welcomes community contributions via pull requests.

Licensing & Compatibility

This repository is licensed under the MIT License, which permits use, modification, and redistribution, including commercial use, provided the copyright and license notice are retained.

Limitations & Caveats

This repository is a curated list of papers and resources; it does not provide a unified, runnable framework for all listed models. Users must consult individual paper repositories for code and specific execution instructions.
