PyTorch implementation of Wave-U-Net for audio source separation
This repository provides an improved PyTorch implementation of Wave-U-Net, a deep learning model for audio source separation. It targets researchers and practitioners in audio processing, offering enhanced scalability, configurability, and training speed for tasks like multi-instrument separation.
How It Works
The Wave-U-Net architecture employs a U-Net structure with convolutional layers adapted for audio waveforms. Improvements include multi-instrument separation by default (using separate models per source), increased scalability via a depth parameter for deeper convolutions, and enhanced configurability for layers, normalization, and residual connections. It also features optimized data preprocessing using HDF files for faster training and separate output convolutions for each source estimate.
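The overall shape of such a network can be sketched in a few dozen lines. The following is an illustration only, assuming hypothetical names (`WaveUNetSketch`, `levels`, the source list) rather than the repository's actual classes or hyperparameters; it shows the downsampling/upsampling structure with skip connections and a separate 1x1 output convolution per source:

```python
# Minimal Wave-U-Net-style 1D U-Net sketch (illustrative, not the repo's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class WaveUNetSketch(nn.Module):
    def __init__(self, sources=("bass", "drums", "other", "vocals"),
                 channels=24, levels=4, kernel_size=5):
        super().__init__()
        self.down_convs = nn.ModuleList()
        self.up_convs = nn.ModuleList()
        ch = 1
        # Downsampling path: convolution, then decimation by 2 per level.
        for level in range(levels):
            out_ch = channels * (level + 1)
            self.down_convs.append(
                nn.Conv1d(ch, out_ch, kernel_size, padding=kernel_size // 2))
            ch = out_ch
        self.bottleneck = nn.Conv1d(ch, ch, kernel_size,
                                    padding=kernel_size // 2)
        # Upsampling path: interpolate, concatenate the skip, convolve.
        for level in reversed(range(levels)):
            out_ch = channels * (level + 1)
            self.up_convs.append(
                nn.Conv1d(ch + out_ch, out_ch, kernel_size,
                          padding=kernel_size // 2))
            ch = out_ch
        # Separate 1x1 output convolution for each source estimate.
        self.heads = nn.ModuleDict(
            {name: nn.Conv1d(ch, 1, 1) for name in sources})

    def forward(self, x):  # x: (batch, 1, time), time divisible by 2**levels
        skips = []
        for conv in self.down_convs:
            x = torch.relu(conv(x))
            skips.append(x)
            x = x[:, :, ::2]  # decimate by a factor of 2
        x = torch.relu(self.bottleneck(x))
        for conv in self.up_convs:
            x = F.interpolate(x, scale_factor=2, mode="linear",
                              align_corners=False)
            x = torch.relu(conv(torch.cat([x, skips.pop()], dim=1)))
        return {name: head(x) for name, head in self.heads.items()}


model = WaveUNetSketch()
mix = torch.randn(1, 1, 44096)  # ~1 s of mono audio at 44.1 kHz
estimates = model(mix)          # dict of per-source waveform estimates
print({name: est.shape for name, est in estimates.items()})
```

The HDF-based preprocessing can likewise be illustrated with h5py: decode each track once, store it as an array on disk, and let training slice fixed-length excerpts without re-decoding audio on every epoch. The file layout and dataset names below are assumptions for illustration, not the repository's actual format:

```python
import numpy as np
import h5py

# Write once: store each decoded track as a (channels, samples) dataset.
# Decoding is stubbed out with random audio here.
with h5py.File("cache.hdf5", "w") as f:
    for i in range(2):
        audio = np.random.randn(2, 44100).astype(np.float32)
        dset = f.create_dataset(f"track_{i}", data=audio)
        dset.attrs["sample_rate"] = 44100

# Read many times: slice fixed-length training excerpts straight from disk.
with h5py.File("cache.hdf5", "r") as f:
    excerpt = f["track_0"][:, 0:16384]
    print(excerpt.shape, f["track_0"].attrs["sample_rate"])
```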
Quick Start & Requirements
pip3 install -r requirements.txt

A virtual environment is recommended. System dependencies: libsndfile, ffmpeg, and CUDA 10.1 (for GPU support).

Highlighted Details
Maintenance & Community
No specific community channels or notable contributors are mentioned in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires a Linux-based OS and Python 3.6. Using custom datasets requires manual code modification, and the absence of a stated license (noted above) may hinder commercial adoption.