wavegrad by lmnt-com

Neural vocoder for high-quality waveform generation from spectrograms

Created 5 years ago

295 stars

Top 89.8% on SourcePulse

Project Summary

WaveGrad is a neural vocoder that converts Mel spectrograms into high-quality audio waveforms through iterative refinement. It is designed for fast, high-fidelity speech synthesis, targeting researchers and developers in audio processing and speech synthesis.

How It Works

WaveGrad is a diffusion-style model that learns to estimate gradients of the data density. Synthesis starts from Gaussian noise and refines it into a waveform over a fixed number of learned denoising steps, each conditioned on the input mel spectrogram. The noise schedule sets the number of steps, so a shorter schedule gives faster inference, possibly at some cost in quality.
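The refinement loop described above can be sketched in a few lines. This is a dependency-free illustration that assumes a standard DDPM-style update rule; it is not the repository's code. The names `refine` and `silence_denoiser`, the `hop_length` of 300 samples per frame, and the six beta values are all hypothetical. In real use the stand-in denoiser would be replaced by the trained network.

```python
import math
import random


def refine(mel, denoise_fn, betas, hop_length=300, seed=0):
    """Refine Gaussian noise into a waveform (DDPM-style sketch, not the repo's code)."""
    rng = random.Random(seed)

    # Cumulative products of alpha_n = 1 - beta_n.
    alpha_cum, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_cum.append(prod)

    # Assumed layout: one spectrogram frame per hop_length audio samples.
    y = [rng.gauss(0.0, 1.0) for _ in range(len(mel) * hop_length)]

    for n in reversed(range(len(betas))):
        # The denoiser predicts the noise in y, given the mel and the noise level.
        noise_level = math.sqrt(alpha_cum[n])
        eps = denoise_fn(y, mel, noise_level)

        # Subtract the predicted noise and rescale (posterior mean).
        c = betas[n] / math.sqrt(1.0 - alpha_cum[n])
        scale = 1.0 / math.sqrt(1.0 - betas[n])
        y = [(yi - c * ei) * scale for yi, ei in zip(y, eps)]

        if n > 0:
            # Re-inject a smaller amount of noise on every step except the last.
            sigma = math.sqrt((1.0 - alpha_cum[n - 1]) / (1.0 - alpha_cum[n]) * betas[n])
            y = [yi + sigma * rng.gauss(0.0, 1.0) for yi in y]

    # Keep samples in the usual [-1, 1] audio range.
    return [max(-1.0, min(1.0, yi)) for yi in y]


def silence_denoiser(y, mel, noise_level):
    """Stand-in for the trained network: assumes the clean signal is silence."""
    return [yi / math.sqrt(1.0 - noise_level ** 2) for yi in y]


# Hypothetical 6-step schedule, ordered from the least to the most noisy step.
EXAMPLE_BETAS = [1e-4, 1e-3, 1e-2, 5e-2, 2e-1, 5e-1]
```

Calling `refine(mel, silence_denoiser, EXAMPLE_BETAS)` with a list of mel frames returns `len(mel) * hop_length` samples. The denoiser runs once per schedule entry, which is why a shorter schedule makes inference faster.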

Quick Start & Requirements

Install with pip (pip install wavegrad) or build from source.
Requires Python and a GPU for efficient training and inference.
Training requires a dataset of 16-bit mono WAV files.
Preprocessed data and trained models are available.
Official documentation and audio samples are linked in the README.
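Since training expects 16-bit mono WAV files, it can help to screen a dataset directory before starting. The sketch below uses only Python's standard `wave` module. The function names `is_16bit_mono` and `find_unusable` are hypothetical helpers and are not part of the wavegrad package.

```python
import wave
from pathlib import Path


def is_16bit_mono(path):
    """Return True if the WAV file has one channel and 2-byte (16-bit) samples."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnchannels() == 1 and wav.getsampwidth() == 2


def find_unusable(directory):
    """List WAV files under `directory` that are not 16-bit mono."""
    return sorted(
        p for p in Path(directory).rglob("*.wav") if not is_16bit_mono(p)
    )
```

Any file that `find_unusable` returns would need to be converted (for example, downmixed to mono or requantized to 16 bits) before it is used for training.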

Highlighted Details

Achieves high-quality synthesis with a diffusion model architecture.
Supports custom noise schedules for faster-than-real-time inference (as few as 6 iterations).
Includes command-line and programmatic inference APIs.
Supports mixed-precision and multi-GPU training.
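"Faster than real time" is usually judged by the real-time factor: synthesis time divided by the duration of the audio produced, where values below 1 mean the audio is generated faster than it plays back. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the numbers in the usage note are illustrative, not measurements of this project.

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Synthesis time divided by the duration of the generated audio."""
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def is_faster_than_real_time(synthesis_seconds, num_samples, sample_rate):
    """True when the audio is generated faster than it plays back."""
    return real_time_factor(synthesis_seconds, num_samples, sample_rate) < 1.0
```

For example, generating 22050 samples at a 22050 Hz sample rate (one second of audio) in 0.5 seconds gives a real-time factor of 0.5.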

Maintenance & Community

The WaveGrad method was introduced by researchers at Google Brain; this repository is an implementation hosted by lmnt-com. The README does not describe any community channels or a roadmap.

Licensing & Compatibility

The README does not explicitly state a license. Check the repository's license file and confirm the terms before commercial or closed-source use.

Limitations & Caveats

The project was last updated in 2020. While marked as stable for training and synthesis, the lack of recent activity may indicate limited ongoing development or support. Finding optimal custom noise schedules may require additional effort.

Explore Similar Projects

assem-vc by maum-ai

SpecVQGAN by v-iashin

WaveGrad by ivanvovk

VocGAN by rishikksh20

FastDiff by Rongjiehuang

vits2 by daniilrobnikov

melgan by seungwonpark

diffwave by lmnt-com

BigVGAN by NVIDIA

audiolm-pytorch by lucidrains

hifi-gan by jik876

tacotron2 by NVIDIA