ddpo-pytorch  by kvablack

PyTorch implementation of DDPO for diffusion model finetuning

created 2 years ago
646 stars

Top 52.7% on sourcepulse

Project Summary

This repository implements Denoising Diffusion Policy Optimization (DDPO) in PyTorch for finetuning diffusion models, specifically Stable Diffusion. Users supply their own prompt and reward functions, which lets them steer image generation toward specific aesthetic or functional goals.

How It Works

DDPO frames diffusion model finetuning as a reinforcement learning problem. It generates images with the diffusion model, scores them with a reward function, and then updates the model's parameters (the policy) to maximize expected reward. The implementation uses LoRA for finetuning, which keeps GPU memory use under 10GB for Stable Diffusion.
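The generate, score, update loop described above can be illustrated with a toy, standard-library sketch. This is not the repository's code: a one-dimensional Gaussian chain stands in for the diffusion model, and the names sample_trajectory, reward_fn, and train, the target value 3.0, and all hyperparameters are hypothetical. It uses a plain score-function (REINFORCE) gradient with per-batch reward normalization, which is only one way to realize the idea.

```python
import random
import statistics


def sample_trajectory(theta, num_steps, sigma, rng):
    """Run a toy 'denoising' chain.

    Returns the final sample and the gradient of the trajectory's
    log-probability with respect to the single parameter theta.
    """
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    grad_logp = 0.0
    for _ in range(num_steps):
        mean = x + theta  # toy policy: shift the sample by theta each step
        x_next = rng.gauss(mean, sigma)
        # d/dtheta of log N(x_next; x + theta, sigma^2)
        grad_logp += (x_next - mean) / sigma**2
        x = x_next
    return x, grad_logp


def reward_fn(x0, target=3.0):
    """Hypothetical reward: higher when the final sample is near the target."""
    return -((x0 - target) ** 2)


def train(num_iters=300, batch_size=64, num_steps=5, sigma=0.5, lr=0.01, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(num_iters):
        # 1. Generate samples with the current policy.
        batch = [sample_trajectory(theta, num_steps, sigma, rng) for _ in range(batch_size)]
        # 2. Score them with the reward function.
        rewards = [reward_fn(x0) for x0, _ in batch]
        mean_r = statistics.fmean(rewards)
        std_r = statistics.pstdev(rewards) or 1.0
        # 3. Policy-gradient step: normalized reward times grad log-prob.
        grad = statistics.fmean(
            ((r - mean_r) / std_r) * g for r, (_, g) in zip(rewards, batch)
        )
        theta += lr * grad  # gradient ascent on expected reward
    return theta


if __name__ == "__main__":
    theta = train()
    print(f"learned per-step shift: {theta:.3f}")
```

In this toy setup the expected final sample is num_steps * theta, so the reward is maximized when theta is near 3.0 / 5 = 0.6. A real finetuning run replaces the Gaussian chain with the denoising steps of Stable Diffusion and the scalar theta with the trainable weights.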

Quick Start & Requirements

  • Install via pip install -e . after cloning the repository.
  • Requires Python 3.10+.
  • GPU memory: <10GB with LoRA enabled for Stable Diffusion finetuning.
  • Official quick-start: https://github.com/kvablack/ddpo-pytorch
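The sub-10GB memory figure above depends on LoRA, which trains two small low-rank matrices per adapted layer instead of the full weight matrix. The arithmetic below is a rough sketch of why that shrinks the trainable parameter count; the 768x768 layer size and rank 4 are illustrative assumptions, not values taken from the repository's configuration.

```python
def full_params(d_in, d_out):
    """Trainable parameters when the whole weight matrix is finetuned."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters for a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in, d_out, rank = 768, 768, 4  # illustrative values only
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(full, lora, full // lora)  # 589824 6144 96
```

Fewer trainable parameters means fewer gradients and optimizer states to hold in GPU memory, which is where most of the savings come from. Actual memory use also depends on batch size, resolution, and precision.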

Highlighted Details

  • Low GPU memory requirement (<10GB) with LoRA for Stable Diffusion finetuning.
  • Supports custom prompt and reward functions for tailored image generation.
  • A DDPO implementation is also available in the Hugging Face trl library as DDPOTrainer.
  • Configuration files (config/base.py, config/dgx.py) provide example settings.
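To make the custom prompt and reward functions mentioned above more concrete, here is a minimal standard-library sketch. The shapes used here are assumptions for illustration: a prompt function that returns a (prompt, metadata) pair, and a reward factory whose inner function takes (images, prompts, metadata) and returns (rewards, info). The names simple_animals and compressibility_reward, the ANIMALS list, and the use of raw bytes in place of image tensors are all hypothetical; check the repository for the signatures it actually expects.

```python
import random
import zlib

ANIMALS = ["cat", "dog", "horse"]  # hypothetical prompt vocabulary


def simple_animals(rng=random):
    """Prompt function (assumed shape): return a prompt and a metadata dict."""
    return f"a photo of a {rng.choice(ANIMALS)}", {}


def compressibility_reward():
    """Reward factory (assumed shape): return a function scoring a batch of images.

    Images are raw bytes objects here as a stand-in for image tensors.
    Smaller zlib output means a more compressible image and a higher reward.
    """

    def _fn(images, prompts, metadata):
        sizes = [len(zlib.compress(img)) for img in images]
        rewards = [-size / 1000.0 for size in sizes]  # negative size in kB
        return rewards, {}

    return _fn


if __name__ == "__main__":
    rng = random.Random(0)
    prompt, meta = simple_animals(rng)
    flat = bytes(1024)  # all zeros: highly compressible
    noisy = bytes(rng.getrandbits(8) for _ in range(1024))  # hard to compress
    rewards, _ = compressibility_reward()([flat, noisy], [prompt, prompt], [meta, meta])
    print(prompt, rewards)
```

The factory pattern lets a reward load a scoring model once and reuse it across batches, which matters when the reward is itself a neural network.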

Maintenance & Community

  • The trl integration was contributed by @metric-space.
  • Supplementary blog post available for guidance.

Licensing & Compatibility

  • License not explicitly stated in the README.

Limitations & Caveats

  • Default hyperparameters are not optimized for performance and require adjustment for good results.
  • LLaVA prompt-image alignment experiments require dedicated GPUs for LLaVA inference.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 92 stars in the last 90 days
