ddpo-pytorch  by kvablack

PyTorch implementation of DDPO for diffusion model finetuning

Created 2 years ago
690 stars

Top 49.3% on SourcePulse

Project Summary

This repository implements Denoising Diffusion Policy Optimization (DDPO) in PyTorch for finetuning diffusion models, specifically targeting Stable Diffusion. Users supply their own prompt and reward functions, which lets them steer image generation toward specific aesthetic or functional goals.

How It Works

DDPO frames diffusion model finetuning as a reinforcement learning problem. It generates images using a diffusion model, evaluates them with a reward function, and then updates the diffusion model's policy (its parameters) to maximize expected rewards. The implementation leverages LoRA for efficient finetuning, significantly reducing memory requirements.
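The update described above can be illustrated with a small, framework-free sketch. It assumes a PPO-style clipped importance-sampling objective over per-step denoising log-probabilities, which is one common way to write such an update; the function names, the data layout, and the default clip_range of 0.2 are illustrative assumptions and are not taken from the repository.

```python
import math


def normalized_advantages(rewards, eps=1e-6):
    """Turn raw rewards into zero-mean, roughly unit-variance advantages."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def clipped_surrogate_loss(new_logps, old_logps, advantages, clip_range=0.2):
    """Average clipped policy-gradient loss (hypothetical helper).

    new_logps[i][t] and old_logps[i][t] hold the log-probability of
    denoising step t for sample i under the current and the sampling-time
    model. advantages[i] is shared by every step of sample i.
    """
    total, count = 0.0, 0
    for new_row, old_row, adv in zip(new_logps, old_logps, advantages):
        for new_lp, old_lp in zip(new_row, old_row):
            ratio = math.exp(new_lp - old_lp)
            clipped = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
            # Take the more pessimistic (larger) of the two losses.
            total += max(-adv * ratio, -adv * clipped)
            count += 1
    return total / count
```

In a real training loop the log-probabilities would come from the diffusion model's denoising steps and the loss would be minimized with a gradient-based optimizer; here plain floats stand in for both.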

Quick Start & Requirements

  • Clone the repository, then run pip install -e . from its root.
  • Requires Python 3.10+.
  • GPU memory: <10GB with LoRA enabled for Stable Diffusion finetuning.
  • Official quick-start: https://github.com/kvablack/ddpo-pytorch

Highlighted Details

  • Low GPU memory requirement (<10GB) with LoRA for Stable Diffusion finetuning.
  • Supports custom prompt and reward functions for tailored image generation.
  • Integrates with Hugging Face trl library for a DDPOTrainer.
  • Configuration files (config/base.py, config/dgx.py) provide example settings.
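To make the custom prompt and reward functions mentioned above more concrete, here is a minimal sketch. It assumes a prompt function returns a (prompt, metadata) pair and a reward function maps a batch of images, prompts, and metadata to one score per image plus an info dictionary; the exact signatures expected by the repository may differ. The names ANIMALS, animal_prompt, and mean_brightness_reward are hypothetical, and images are represented as nested lists of grayscale values in [0, 1] to keep the example dependency-free.

```python
import random

# Hypothetical prompt vocabulary, for illustration only.
ANIMALS = ["cat", "dog", "horse"]


def animal_prompt(rng=random):
    """Sample one prompt and return it with metadata about the choice."""
    animal = rng.choice(ANIMALS)
    return f"a photo of a {animal}", {"animal": animal}


def mean_brightness_reward(images, prompts, metadata):
    """Toy reward: the average pixel value of each image.

    images is a list of 2D grayscale images (lists of rows of floats).
    prompts and metadata are accepted but unused by this toy reward.
    """
    rewards = []
    for image in images:
        pixels = [p for row in image for p in row]
        rewards.append(sum(pixels) / len(pixels))
    return rewards, {}
```

A reward of this shape favors brighter images; a practical reward would typically call a learned scorer instead, but the inputs and outputs would have the same structure under the stated assumptions.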

Maintenance & Community

  • The trl integration was contributed by @metric-space.
  • Supplementary blog post available for guidance.

Licensing & Compatibility

  • License not explicitly stated in the README.

Limitations & Caveats

  • Default hyperparameters are not optimized for performance and require adjustment for good results.
  • LLaVA prompt-image alignment experiments require dedicated GPUs for LLaVA inference.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 22 stars in the last 30 days
