Speech enhancement model using a dual-signal transformation LSTM network
Top 51.6% on SourcePulse
This repository provides a TensorFlow 2.x implementation of the Dual-signal Transformation LSTM Network (DTLN) for real-time speech denoising. It's designed for researchers and developers working on audio processing, noise suppression, and embedded systems, offering competitive performance with a small model footprint suitable for devices like the Raspberry Pi.
How It Works
DTLN stacks two LSTM-based stages: the first works on the magnitude of a Short-Time Fourier Transform (STFT), and the second works on a learned analysis and synthesis basis. Because the learned basis operates on the time-domain signal, the network can use phase information in addition to the magnitude spectrum. This gives competitive noise suppression with fewer than one million parameters. The model is trained on extensive datasets, and its small size allows real-time processing with low latency.
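The signal flow described above can be sketched roughly as follows. This is a NumPy illustration, not the repository's TensorFlow code: the frame length and hop (512 and 128 samples), the random orthogonal matrix standing in for the learned basis, the pass-through `unit_mask` placeholder for the LSTM cores, and the division by the overlap count are all assumptions made for the example.

```python
import numpy as np

BLOCK_LEN = 512    # assumed frame length in samples
BLOCK_SHIFT = 128  # assumed hop between consecutive frames

rng = np.random.default_rng(0)
# Random orthogonal matrix as a stand-in for the learned analysis basis.
# In a trained model this would be a learned layer, not a fixed matrix.
analysis_basis, _ = np.linalg.qr(rng.standard_normal((BLOCK_LEN, BLOCK_LEN)))
synthesis_basis = analysis_basis.T  # the transpose inverts an orthogonal matrix


def unit_mask(features):
    """Hypothetical placeholder for an LSTM stage: a mask of ones changes nothing."""
    return np.ones_like(features)


def process_frame(frame, mask_stage_1=unit_mask, mask_stage_2=unit_mask):
    # Stage 1: mask the STFT magnitude and reuse the noisy phase.
    spectrum = np.fft.rfft(frame)
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)
    masked = magnitude * mask_stage_1(magnitude)
    stage_1 = np.fft.irfft(masked * np.exp(1j * phase), n=BLOCK_LEN)
    # Stage 2: mask features in the (here random) analysis basis, then
    # map back to the time domain with the synthesis basis.
    features = stage_1 @ analysis_basis
    return (features * mask_stage_2(features)) @ synthesis_basis


def enhance(signal):
    """Process overlapping frames and overlap-add the results."""
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - BLOCK_LEN + 1, BLOCK_SHIFT):
        out[start:start + BLOCK_LEN] += process_frame(signal[start:start + BLOCK_LEN])
    # Each interior sample is covered by BLOCK_LEN // BLOCK_SHIFT frames.
    return out / (BLOCK_LEN // BLOCK_SHIFT)
```

With the pass-through masks, `enhance` returns the input unchanged wherever frames fully overlap, which is a convenient sanity check before swapping in real mask functions.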
Quick Start & Requirements
Setup: conda environment files (train_env.yml, eval_env.yml, tflite_env.yml).
Run: python run_evaluation.py -i <input_folder> -o <output_folder> -m ./pretrained_model/model.h5
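If the evaluation script needs to be called from another Python program, a thin wrapper like the one below may help. It only assembles the command line shown above and hands it to `subprocess`. The helper names `build_eval_command` and `run_evaluation` are hypothetical, and the sketch assumes it is run from the repository root with the evaluation environment active.

```python
import subprocess
import sys
from pathlib import Path

DEFAULT_MODEL = "./pretrained_model/model.h5"  # model path from the command above


def build_eval_command(input_dir, output_dir, model=DEFAULT_MODEL):
    """Return the argument list for run_evaluation.py (hypothetical helper)."""
    return [
        sys.executable, "run_evaluation.py",
        "-i", str(Path(input_dir)),
        "-o", str(Path(output_dir)),
        "-m", model,
    ]


def run_evaluation(input_dir, output_dir, model=DEFAULT_MODEL):
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_eval_command(input_dir, output_dir, model), check=True)
```

For example, `run_evaluation("noisy_wavs", "enhanced_wavs")` would process a folder named `noisy_wavs`; both folder names are placeholders.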
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
2 years ago
Inactive