PyTorch implementation of Wave-U-Net for audio source separation
This repository provides an improved PyTorch implementation of Wave-U-Net, a deep learning model for audio source separation. It targets researchers and practitioners in audio processing, offering enhanced scalability, configurability, and training speed for tasks like multi-instrument separation.
How It Works
The Wave-U-Net architecture employs a U-Net structure with convolutional layers adapted for audio waveforms. Improvements include multi-instrument separation by default (using separate models per source), increased scalability via a depth parameter for deeper convolutions, and enhanced configurability for layers, normalization, and residual connections. It also features optimized data preprocessing using HDF files for faster training and separate output convolutions for each source estimate.
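The overall shape of such a network can be sketched in a few dozen lines. The following is an illustration only, assuming hypothetical names (`WaveUNetSketch`, `levels`, the source list) rather than the repository's actual classes or hyperparameters; it shows the downsampling/upsampling structure with skip connections and a separate 1x1 output convolution per source:

```python
# Minimal Wave-U-Net-style 1D U-Net sketch (illustrative, not the repo's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class WaveUNetSketch(nn.Module):
    def __init__(self, sources=("bass", "drums", "other", "vocals"),
                 channels=24, levels=4, kernel_size=5):
        super().__init__()
        self.down_convs = nn.ModuleList()
        self.up_convs = nn.ModuleList()
        ch = 1
        # Downsampling path: convolution, then decimation by 2 per level.
        for level in range(levels):
            out_ch = channels * (level + 1)
            self.down_convs.append(
                nn.Conv1d(ch, out_ch, kernel_size, padding=kernel_size // 2))
            ch = out_ch
        self.bottleneck = nn.Conv1d(ch, ch, kernel_size,
                                    padding=kernel_size // 2)
        # Upsampling path: interpolate, concatenate the skip, convolve.
        for level in reversed(range(levels)):
            out_ch = channels * (level + 1)
            self.up_convs.append(
                nn.Conv1d(ch + out_ch, out_ch, kernel_size,
                          padding=kernel_size // 2))
            ch = out_ch
        # Separate 1x1 output convolution for each source estimate.
        self.heads = nn.ModuleDict(
            {name: nn.Conv1d(ch, 1, 1) for name in sources})

    def forward(self, x):  # x: (batch, 1, time), time divisible by 2**levels
        skips = []
        for conv in self.down_convs:
            x = torch.relu(conv(x))
            skips.append(x)
            x = x[:, :, ::2]  # decimate by a factor of 2
        x = torch.relu(self.bottleneck(x))
        for conv in self.up_convs:
            x = F.interpolate(x, scale_factor=2, mode="linear",
                              align_corners=False)
            x = torch.relu(conv(torch.cat([x, skips.pop()], dim=1)))
        return {name: head(x) for name, head in self.heads.items()}


model = WaveUNetSketch()
mix = torch.randn(1, 1, 44096)  # ~1 s of mono audio at 44.1 kHz
estimates = model(mix)          # dict of per-source waveform estimates
print({name: est.shape for name, est in estimates.items()})
```

The HDF-based preprocessing can likewise be illustrated with h5py: decode each track once, store it as an array on disk, and let training slice fixed-length excerpts without re-decoding audio on every epoch. The file layout and dataset names below are assumptions for illustration, not the repository's actual format:

```python
import numpy as np
import h5py

# Write once: store each decoded track as a (channels, samples) dataset.
# Decoding is stubbed out with random audio here.
with h5py.File("cache.hdf5", "w") as f:
    for i in range(2):
        audio = np.random.randn(2, 44100).astype(np.float32)
        dset = f.create_dataset(f"track_{i}", data=audio)
        dset.attrs["sample_rate"] = 44100

# Read many times: slice fixed-length training excerpts straight from disk.
with h5py.File("cache.hdf5", "r") as f:
    excerpt = f["track_0"][:, 0:16384]
    print(excerpt.shape, f["track_0"].attrs["sample_rate"])
```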
Quick Start & Requirements
pip3 install -r requirements.txt

A virtual environment is recommended. System dependencies: libsndfile, ffmpeg, and CUDA 10.1 (for GPU support).

Highlighted Details
Maintenance & Community
No specific community channels or notable contributors are mentioned in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires a Linux-based OS and Python 3.6. Using custom datasets requires manual code modification, and the absence of a stated license (noted above) may hinder commercial adoption.