Neural network for end-to-end speech denoising
This project provides a neural network for end-to-end speech denoising, implementing a WaveNet architecture. It is targeted at researchers and developers working on audio processing and speech enhancement, offering a pre-trained model for immediate use and clear instructions for training and inference.
How It Works
The project uses a WaveNet architecture, which is effective at modeling sequential data such as audio. It stacks convolutional layers with increasing dilation rates, so the model can capture long-range dependencies in the audio signal without recurrent connections. This suits speech denoising, where the model must learn the complex patterns of both speech and noise.
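To make the effect of increasing dilation rates concrete, the sketch below computes how many input samples influence a single output sample in a stack of dilated convolutions. The kernel size, the number of layers, and the doubling dilation pattern are illustrative assumptions, not values taken from this summary.

```python
def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output sample.

    Each dilated convolution layer widens the receptive field by
    (kernel_size - 1) * dilation samples.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed layout: 3 stacks of 10 layers, dilation doubling within each stack
# (1, 2, 4, ..., 512). These numbers are hypothetical.
assumed_dilations = [2 ** i for i in range(10)] * 3
assumed_kernel_size = 3

rf = receptive_field(assumed_kernel_size, assumed_dilations)
print(rf)  # 6139
```

Under these assumptions, 30 layers cover 6139 samples, roughly 0.38 s of audio at an assumed 16 kHz sampling rate. The same 30 layers without dilation (all rates equal to 1) would cover only 61 samples.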
Quick Start & Requirements
Install the dependencies with pip install -r requirements.txt.
The Theano flags optimizer=fast_compile and device=gpu are recommended for usage.
The dataset is expected under data/NSDTSEA.
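One way to apply the recommended flags is through Theano's THEANO_FLAGS environment variable, which takes comma-separated key=value pairs and is read when Theano is imported. The helper name theano_flags below is hypothetical and only formats the string; it is a minimal sketch, not part of the project.

```python
import os


def theano_flags(**flags):
    """Format keyword arguments as a THEANO_FLAGS string, e.g. 'a=1,b=2'.

    Keys are sorted so the result is deterministic.
    """
    return ",".join("{}={}".format(k, flags[k]) for k in sorted(flags))


# Set the variable before Theano is imported, because Theano reads its
# configuration at import time.
os.environ["THEANO_FLAGS"] = theano_flags(optimizer="fast_compile", device="gpu")
print(os.environ["THEANO_FLAGS"])  # device=gpu,optimizer=fast_compile
```

Setting the variable in the shell before launching Python has the same effect and avoids any ordering concerns inside the script.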
Highlighted Details
target_field_length can be set for faster denoising.
Configuration options are described in config.md.
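The sketch below illustrates why a longer target field can speed up denoising. It assumes that the model predicts a block of target_field_length output samples per forward pass and that each pass needs receptive_field + target_field_length - 1 input samples. Both function names and all numeric values are hypothetical and are not taken from the project.

```python
def num_forward_passes(num_samples, target_field_length):
    """Forward passes needed to cover num_samples output samples (ceiling division)."""
    return -(-num_samples // target_field_length)


def input_length(receptive_field, target_field_length):
    """Input samples required per pass, assuming one context window per output block."""
    return receptive_field + target_field_length - 1


# Assumed values: 1 s of 16 kHz audio and a receptive field of 6139 samples.
one_second = 16000
assumed_receptive_field = 6139

for target in (1, 1601, 16000):
    print(target,
          num_forward_passes(one_second, target),
          input_length(assumed_receptive_field, target))
```

Under these assumptions, predicting one sample per pass takes 16000 passes for one second of audio, while a target field of 1601 samples takes 10. The cost is a longer input window per pass, and therefore more memory.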
Maintenance & Community
No specific information on contributors, sponsorships, or community channels (like Discord/Slack) is provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project has a strict dependency on older versions of Keras (1.2) and Theano (0.9.0), which may pose significant challenges for setup and compatibility with modern deep learning environments. TensorFlow 1.2.0 is explicitly stated as unsupported.
Project status: inactive (last updated 2 years ago).