Paint-by-Example by Fantasy-Studio

Image editing research paper using exemplar guidance and diffusion

created 2 years ago
1,204 stars

Top 33.2% on sourcepulse

View on GitHub
Project Summary

This repository provides code for "Paint by Example," an exemplar-based image editing technique that leverages diffusion models for precise control. It enables users to edit images by providing a reference image (exemplar) and a mask, allowing for high-fidelity modifications guided by the exemplar's style and content.

How It Works

The method builds on a modified Stable Diffusion v1-4 model that disentangles and reorganizes information from the source image and the exemplar. To prevent the model from simply copy-pasting exemplar content, it introduces an information bottleneck and strong augmentations, which suppress fusing artifacts. An arbitrarily shaped mask for the exemplar and classifier-free guidance further improve controllability and similarity to the reference image, and editing is performed in a single forward pass of the diffusion model.
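The classifier-free guidance step mentioned above can be sketched as follows. This is a minimal illustration, assuming the model emits separate unconditional and exemplar-conditioned noise predictions; the function name, array shapes, and the guidance_scale value are illustrative, not taken from the repository:

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale=5.0):
    """Combine unconditional and exemplar-conditioned noise predictions.

    A larger guidance_scale pushes the denoising direction toward the
    exemplar-conditioned prediction, increasing similarity to the
    reference image at the cost of diversity.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy example: with guidance_scale=1.0 the result is exactly the
# conditional prediction; larger scales extrapolate past it.
eps_u = np.zeros(4)
eps_c = np.ones(4)
print(classifier_free_guidance(eps_u, eps_c, guidance_scale=1.0))  # [1. 1. 1. 1.]
```

In practice both predictions come from the same diffusion model, evaluated with and without the exemplar conditioning, so the extra cost is one additional forward pass per denoising step.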

Quick Start & Requirements

  • Install via conda env create -f environment.yaml and conda activate Paint-by-Example.
  • Requires a pre-trained Stable Diffusion v1-4 model downloaded and placed in pretrained_models/.
  • A script scripts/modify_checkpoints.py is needed to adapt the Stable Diffusion checkpoint.
  • Official Hugging Face demo: https://huggingface.co/spaces/Bingsheng/Paint-by-Example
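Taken together, the steps above amount to something like the following shell session. The checkpoint location and the arguments to scripts/modify_checkpoints.py are assumptions; consult the repository README for the exact invocation:

```shell
# Create and activate the conda environment (run from the repository root)
conda env create -f environment.yaml
conda activate Paint-by-Example

# Place a pre-trained Stable Diffusion v1-4 checkpoint (downloaded
# separately) into pretrained_models/
mkdir -p pretrained_models

# Adapt the checkpoint for Paint by Example; exact flags may differ,
# see the repository README or the script's help output
python scripts/modify_checkpoints.py
```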

Highlighted Details

  • Achieves impressive performance and controllable editing on in-the-wild images with high fidelity.
  • Includes a custom test benchmark (COCOEE) for quantitative analysis.
  • Supports FID, QS, and CLIP score evaluations.
  • Code borrows heavily from Stable Diffusion and OpenAI's ADM codebase.

Maintenance & Community

  • Issues can be opened on GitHub for support.
  • Contact information for technical questions is available.

Licensing & Compatibility

  • Code and pre-trained model are under the CreativeML OpenRAIL M license.
  • The COCOEE test benchmark is licensed under Creative Commons Attribution 4.0 License.

Limitations & Caveats

The project mentions a recent work, Asymmetric VQGAN, that improves detail preservation in non-masked regions, suggesting potential limitations in the current implementation's detail handling.

Health Check
Last commit

1 year ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
26 stars in the last 90 days

Explore Similar Projects

Starred by Dan Abramov (Core Contributor to React), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 28 more.

stable-diffusion by CompVis

Latent text-to-image diffusion model
Top 0.1%
71k stars
created 3 years ago
updated 1 year ago