Accelerate diffusion model training
This repository provides an efficient diffusion model training strategy, Min-SNR weighting, designed to accelerate convergence and improve sample quality for image generation tasks. It is targeted at researchers and practitioners working with diffusion models who aim to reduce training time and achieve state-of-the-art results.
How It Works
The Min-SNR weighting strategy addresses slow diffusion model convergence by treating training as a multi-task learning problem. It adaptively adjusts loss weights for different timesteps based on clamped signal-to-noise ratios (SNRs). This approach effectively balances conflicting optimization objectives across timesteps, leading to significantly faster convergence compared to previous methods.
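The clamping described above can be sketched in a few lines. This is a minimal illustration rather than the repository's code, assuming an x0-prediction loss and the gamma = 5 clamp that appears in the training config's filename:

```python
def min_snr_weight(alpha_bar_t: float, gamma: float = 5.0) -> float:
    """Per-timestep loss weight: min(SNR(t), gamma) for x0-prediction.

    alpha_bar_t is the cumulative product of (1 - beta_s) up to step t,
    so the signal-to-noise ratio is SNR(t) = alpha_bar_t / (1 - alpha_bar_t).
    """
    snr = alpha_bar_t / (1.0 - alpha_bar_t)
    # Clamp: low-noise timesteps with huge SNR are capped at gamma,
    # so no single timestep dominates the multi-task objective.
    return min(snr, gamma)

print(min_snr_weight(0.99))  # SNR = 99 -> clamped to 5.0
print(min_snr_weight(0.10))  # SNR ~ 0.11 -> kept as-is
```

In a training loop, this weight would multiply the per-sample loss for each sampled timestep, down-weighting the easy low-noise steps that otherwise dominate the gradient.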
Quick Start & Requirements
Training (ImageNet-256):
bash configs/in256/vit-b_layer12_lr1e-4_099_099_pred_x0__min_snr_5__fp16_bs8x32.sh <GPUS> <BATCH_SIZE_PER_GPU>
Inference (ImageNet-256):
bash configs/in256/inference.sh
or bash configs/in256/inference_limited_interval_guidance.sh
Inference (ImageNet-64):
bash configs/in64/inference.sh
Requirements: mixed-precision (fp16) training support and the ImageNet or CelebA datasets. ImageNet-256 requires pre-processing with AutoencoderKL from HuggingFace Diffusers.
Highlighted Details
The Min-SNR weighting strategy has been integrated into diffusers and k-diffusion.
Maintenance & Community
The project is based on openai/guided-diffusion and uses sampling and FID evaluation code from NVlabs/edm. It has seen adoption in projects such as PLAID and MuLan.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. However, its reliance on openai/guided-diffusion and NVlabs/edm suggests potential licensing considerations for commercial use.
Limitations & Caveats
The README does not specify the exact license, which could impact commercial adoption. The training scripts are configured for specific model architectures (e.g., ViT-B) and require dataset preparation.
Last updated 9 months ago; the project appears inactive.