AlignProp  by mihirp1998

Finetuning method for text-to-image diffusion models

created 1 year ago
293 stars

Top 91.2% on sourcepulse

Project Summary

AlignProp fine-tunes text-to-image diffusion models to align with specific reward functions, such as aesthetic quality or semantic accuracy. It targets researchers and practitioners working with large diffusion models, and offers a more compute- and sample-efficient alternative to reinforcement learning approaches such as PPO.

How It Works

AlignProp backpropagates the reward gradient directly through the diffusion model's denoising process. To keep memory use manageable, it fine-tunes only low-rank adapter weight modules and uses gradient checkpointing. This allows end-to-end optimization against differentiable reward functions, which makes alignment simpler than with RL methods.
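
The mechanism described above can be illustrated with a toy PyTorch sketch. This is not the repository's code: the names LoRALinear, denoise_step, sample, reward, and train are hypothetical, the "denoiser" is a single linear layer rather than a UNet, the update rule is a stand-in for a real sampler step, and the reward is an invented quadratic. It only shows the three ingredients together: a frozen base model with a trainable low-rank adapter, gradient checkpointing over the sampling loop, and a reward gradient that flows back through every denoising step.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (hypothetical toy)."""

    def __init__(self, dim: int, rank: int = 2):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        for p in self.base.parameters():
            p.requires_grad_(False)  # "pretrained" weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(rank, dim) * 0.1)
        # Zero init so the adapted model starts out identical to the base model.
        self.lora_b = nn.Parameter(torch.zeros(dim, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.lora_a.T @ self.lora_b.T


def denoise_step(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # Assumed toy update rule standing in for a real sampler step.
    return x - 0.1 * model(x)


def sample(model: nn.Module, x: torch.Tensor, num_steps: int = 10) -> torch.Tensor:
    for _ in range(num_steps):
        # Activations inside each step are recomputed during backward
        # instead of being stored, trading compute for memory.
        x = checkpoint(denoise_step, model, x, use_reentrant=False)
    return x


def reward(x: torch.Tensor) -> torch.Tensor:
    # Assumed differentiable reward: prefer samples close to 1.0 everywhere.
    return -((x - 1.0) ** 2).mean()


def train(steps: int = 50, dim: int = 4, seed: int = 0):
    torch.manual_seed(seed)
    model = LoRALinear(dim)
    trainable = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(trainable, lr=1e-2)
    noise = torch.randn(16, dim)  # fixed starting noise for the toy run
    history = []
    for _ in range(steps):
        r = reward(sample(model, noise))
        opt.zero_grad()
        (-r).backward()  # gradient flows through all denoising steps
        opt.step()
        history.append(r.item())
    return model, history
```

Calling train() returns the adapted model and the per-step reward history. Only lora_a and lora_b are handed to the optimizer, so the base layer's weights are left untouched.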

Quick Start & Requirements

  • Install: Create a conda environment (conda create -n alignprop python=3.10) and install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.10 and CUDA-enabled GPUs. Experiments used 4x A100 GPUs (40 GB each); users with less GPU memory should reduce train_batch_size or use K=1.
  • Training: Scripts aesthetic.sh and hps.sh are provided for aesthetic and HPSv2 reward models, respectively. Variants for memory-constrained environments (_k1.sh) are also available.
  • Resources: This is the official implementation, built on DDPO. The README links to the arXiv paper and the project website.

Highlighted Details

  • Reported to be 25x more sample- and compute-efficient than PPO for Stable Diffusion fine-tuning.
  • Achieves higher rewards in fewer training steps.
  • Conceptually simpler than RL-based alignment methods.
  • Supports fine-tuning for image-text semantic alignment, aesthetics, compressibility, and controllability.

Maintenance & Community

The codebase is built upon DDPO. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Without one, reuse in commercial or closed-source projects may be restricted.

Limitations & Caveats

The project is presented as an official implementation of a research paper, suggesting it may be experimental. Specific details on long-term maintenance or community support are not provided.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

14 stars in the last 90 days
