SEMamba by RoyChao19477

Mamba-based speech enhancement models

Created 1 year ago
258 stars

Top 98.0% on SourcePulse

Project Summary

SEMamba provides an official implementation for speech enhancement (SE) models based on the Mamba architecture, designed for universal, robust, and generalizable performance. It addresses diverse audio distortions and sampling frequencies with a single model, targeting researchers and engineers in audio signal processing. The project achieved 4th place in the URGENT challenge at IEEE SLT 2024.

How It Works

SEMamba integrates the Mamba architecture into a speech enhancement pipeline so that one model can handle a wide range of audio degradations, including additive noise, reverberation, clipping, and bandwidth limitation. It relies on Mamba's sequence modeling to apply a unified approach across sampling rates and distortion types, which is intended to improve robustness and generalization.

Quick Start & Requirements

  • Installation: Create a Conda environment (python=3.9), install PyTorch 2.2.2, run pip install -r requirements.txt, then install Mamba from source (cd mamba_install && pip install .). Docker environments for x86 and ARM are also available.
  • Prerequisites: Python >= 3.9, CUDA >= 12.0, PyTorch == 2.2.2. Requires an NVIDIA GPU from the RTX generation or newer (e.g., A100, RTX 4090, RTX 3090, GH200).
  • Links: Live Demo: https://huggingface.co/spaces/rc19477/Speech_Enhancement_Mamba.
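
The version constraints listed above can be checked before installing Mamba from source. The following is a minimal sketch, not part of the SEMamba repository: the helper names parse_version, check_prerequisites, and detect_versions are hypothetical, and the CUDA value read from PyTorch is the toolkit version PyTorch was built against, which may differ from the installed driver.

```python
import sys


def parse_version(text):
    """Turn a version string such as '2.2.2+cu121' into a tuple like (2, 2, 2)."""
    parts = []
    # Drop any local build suffix ("+cu121") before splitting on dots.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_prerequisites(python_version, torch_version, cuda_version):
    """Return a list of human-readable problems; an empty list means all checks pass.

    Thresholds come from the prerequisites above:
    Python >= 3.9, PyTorch == 2.2.2, CUDA >= 12.0.
    """
    problems = []
    if parse_version(python_version) < (3, 9):
        problems.append(f"Python {python_version} is older than 3.9")
    if parse_version(torch_version)[:3] != (2, 2, 2):
        problems.append(f"PyTorch {torch_version} is not 2.2.2")
    if cuda_version is None:
        problems.append("PyTorch reports no CUDA support")
    elif parse_version(cuda_version) < (12,):
        problems.append(f"CUDA {cuda_version} is older than 12.0")
    return problems


def detect_versions():
    """Collect the three version strings from the current environment."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    try:
        import torch  # imported lazily so the checks above work without PyTorch
    except ImportError:
        return python_version, "0", None
    return python_version, str(torch.__version__), torch.version.cuda
```

Calling check_prerequisites(*detect_versions()) inside the Conda environment returns the list of mismatches, if any. The sketch does not check the GPU model, so the hardware requirement still has to be confirmed separately.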

Highlighted Details

  • Ranked 4th out of 70 teams in the URGENT challenge (IEEE SLT 2024) and presented at NeurIPS 2024.
  • Features a live HuggingFace demo for direct audio enhancement.
  • Offers pre-built Docker images for simplified deployment on x86 and ARM architectures.
  • Implements Perceptual Contrast Stretching (PCS) as an optional training target or post-processing step.

Maintenance & Community

No explicit community channels (e.g., Discord, Slack), roadmap, or detailed contributor information are provided in the README.

Licensing & Compatibility

The repository's license is not specified in the README, which is a critical omission for assessing commercial use or derivative works.

Limitations & Caveats

  • Hardware: Limited to GPUs from the RTX generation or newer; older GPUs such as the GTX 1080 Ti or Tesla V100 may not be supported.
  • CUDA Issues: Users experiencing CUDA problems are advised to switch to the mamba-2 branch for potential compatibility improvements.
  • Installation: Careful adherence to installation steps, including installing dependencies from source, is recommended to prevent conflicts.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 6 stars in the last 30 days
