Palette-Image-to-Image-Diffusion-Models by Janspiry

PyTorch image-to-image diffusion model implementation

created 3 years ago
1,714 stars

Top 25.4% on sourcepulse

Project Summary

This repository provides an unofficial PyTorch implementation of Palette: Image-to-Image Diffusion Models, targeting researchers and practitioners in generative AI. It offers a framework for various image-to-image tasks like inpainting, uncropping, and colorization, leveraging a U-Net architecture and attention mechanisms for enhanced sample quality.

How It Works

The implementation adapts the U-Net architecture from OpenAI's Guided-Diffusion and applies attention in the low-resolution (16x16) feature maps, as in vanilla DDPM. Instead of the timestep $t$, it encodes the noise level $\gamma$ directly and injects the embedding through an affine transformation, and it fixes the variance $\Sigma_\theta(x_t, t)$ to a constant during inference, as described in the Palette paper. This approach aims for robust training and high-quality samples.
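To make the conditioning concrete, below is a minimal PyTorch sketch of embedding a noise level $\gamma$ and injecting it into a feature map through an affine (scale-and-shift) transformation. The class names, dimensions, and frequency scaling are illustrative assumptions, not the repository's actual API.

    import math
    import torch
    import torch.nn as nn

    class GammaEmbedding(nn.Module):
        # Sinusoidal embedding of the noise level gamma, followed by a small MLP (illustrative).
        def __init__(self, dim=64):
            super().__init__()
            self.dim = dim
            self.mlp = nn.Sequential(
                nn.Linear(dim, dim * 4), nn.SiLU(), nn.Linear(dim * 4, dim * 4)
            )

        def forward(self, gamma):
            # gamma: (batch,) noise levels in [0, 1]
            half = self.dim // 2
            freqs = torch.exp(-math.log(1e4) * torch.arange(half, device=gamma.device) / half)
            angles = 1000 * gamma[:, None] * freqs[None, :]  # scale up small gamma values
            emb = torch.cat([angles.sin(), angles.cos()], dim=-1)
            return self.mlp(emb)

    class FeatureWiseAffine(nn.Module):
        # Injects the gamma embedding into a feature map as a per-channel scale and shift.
        def __init__(self, emb_dim, channels):
            super().__init__()
            self.proj = nn.Linear(emb_dim, channels * 2)

        def forward(self, x, emb):
            # x: (batch, channels, H, W), emb: (batch, emb_dim)
            scale, shift = self.proj(emb)[:, :, None, None].chunk(2, dim=1)
            return x * (1 + scale) + shift

    # Example: condition a 64-channel feature map on a batch of sampled noise levels.
    embed = GammaEmbedding(dim=64)
    affine = FeatureWiseAffine(emb_dim=256, channels=64)
    gamma = torch.rand(8)
    feats = torch.randn(8, 64, 16, 16)
    out = affine(feats, embed(gamma))

The key point is that the scalar noise level, rather than the integer timestep, drives the conditioning.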

Quick Start & Requirements

  • Install via pip install -r requirements.txt.
  • Requires Python and PyTorch.
  • Pre-trained models for Celeba-HQ and Places2 inpainting are available via Google Drive links.
  • Data preparation involves downloading the datasets (CelebA-HQ, Places2, ImageNet) and editing the configuration files to point to the local data paths (a sketch of this step follows this list).
  • Official quick-start and demo scripts are available.
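As a rough illustration of the configuration step, the data path can be updated programmatically before training. The config file name and nested field names below are assumptions about the JSON layout (the repository ships its own configs), so verify them against the actual files.

    import json

    # Hypothetical config name; the repository provides several JSON configs.
    cfg_path = "config/inpainting_celebahq.json"
    with open(cfg_path) as f:
        cfg = json.load(f)

    # Assumed field layout: point the training split at a local CelebA-HQ folder.
    cfg["datasets"]["train"]["which_dataset"]["args"]["data_root"] = "/data/celebahq/train"

    with open(cfg_path, "w") as f:
        json.dump(cfg, f, indent=4)

Training and testing are then launched through the repository's entry script with the chosen config, per the README.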

Highlighted Details

  • Reports an FID of 5.7873 and an IS of 3.0705 for inpainting on CelebA-HQ with center masks.
  • Supports multi-GPU training via DDP, EMA weight averaging, and TensorBoard logging (a minimal EMA sketch follows this list).
  • Implements the diffusion model, the train/test pipeline, and saving/loading of training states.
  • Includes dataset support for inpainting, uncropping, and colorization.
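For context on the EMA feature, exponential moving averaging keeps a shadow copy of the model weights that is updated after every optimizer step and is typically used for sampling. A minimal sketch, with the class name and decay value chosen for illustration rather than taken from the repository:

    import copy
    import torch

    class EMA:
        # Keeps an exponential moving average of a model's parameters (illustrative).
        def __init__(self, model, decay=0.9999):
            self.decay = decay
            self.shadow = copy.deepcopy(model).eval()
            for p in self.shadow.parameters():
                p.requires_grad_(False)

        @torch.no_grad()
        def update(self, model):
            # shadow = decay * shadow + (1 - decay) * current
            for ema_p, p in zip(self.shadow.parameters(), model.parameters()):
                ema_p.mul_(self.decay).add_(p, alpha=1 - self.decay)

After each training step, ema.update(model) is called; sampling then uses ema.shadow rather than the raw weights, which usually gives smoother, higher-quality outputs.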

Maintenance & Community

The project is an unofficial implementation and does not list specific maintainers or community channels. It acknowledges inspiration from OpenAI's guided-diffusion and Diffusion-Based-Model-for-Colorization.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is an unofficial implementation and notes that follow-up experiments are uncertain due to time and GPU resource constraints. Some tasks like uncropping and colorization are marked as not yet implemented. The DDPM model requires significant computational resources.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 46 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Travis Fischer (Founder of Agentic), and 3 more.

consistency_models by openai
  • 6k stars
  • PyTorch code for consistency models research paper
  • Created 2 years ago, updated 1 year ago

Starred by Aravind Srinivas (Cofounder of Perplexity), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 3 more.

guided-diffusion by openai
  • 7k stars
  • Image synthesis codebase for diffusion models
  • Created 4 years ago, updated 1 year ago