Research paper implementation for zero-shot appearance transfer
This repository provides the official implementation for "Cross-Image Attention for Zero-Shot Appearance Transfer," a SIGGRAPH 2024 paper. It enables users to transfer the visual appearance between objects with similar semantics but different shapes, leveraging the semantic understanding of text-to-image generative models. The primary audience is researchers and practitioners in computer vision and generative AI interested in zero-shot image manipulation.
How It Works
The core mechanism builds upon the self-attention layers of diffusion models. It introduces a cross-image attention mechanism that implicitly establishes semantic correspondences between two input images: one for structure and one for appearance. By combining queries from the structure image with keys and values from the appearance image during the denoising process, it generates an output image that merges the desired structure and appearance without requiring any training or optimization.
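The query/key/value mixing described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the repository's code: the function name `cross_image_attention`, the toy token counts, and the random features are all hypothetical, and a real diffusion model applies this per attention head inside its denoising network.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_image_attention(q_struct, k_app, v_app):
    """Scaled dot-product attention with queries from one image and
    keys/values from another (hypothetical helper for illustration).

    q_struct: (n_struct, d) queries from the structure image
    k_app:    (n_app, d)    keys from the appearance image
    v_app:    (n_app, d_v)  values from the appearance image
    """
    d = q_struct.shape[-1]
    # Each structure token scores every appearance token.
    scores = q_struct @ k_app.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    # Each output token is a weighted average of appearance values.
    return weights @ v_app, weights


# Toy features standing in for flattened spatial tokens.
rng = np.random.default_rng(0)
q_struct = rng.normal(size=(16, 8))
k_app = rng.normal(size=(20, 8))
v_app = rng.normal(size=(20, 8))

out, weights = cross_image_attention(q_struct, k_app, v_app)
print(out.shape)
```

The output has one row per structure token, so the spatial layout follows the structure image, while the content of each row is drawn entirely from appearance-image values.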
Quick Start & Requirements
Install: conda env create -f environment/environment.yaml, followed by conda activate cross_image
Dependencies: listed in environment/environment.yaml
Run: python run.py --app_image_path /path/to/appearance/image.png --struct_image_path /path/to/structure/image.png --output_path /path/to/output/images.png --domain_name [domain]
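For scripted or batch use, the run command above could be assembled from Python. The sketch below only uses the four flags shown in the command; the helper names `build_command` and `run_transfer`, the example paths, and the domain value `animal` are hypothetical placeholders, not values taken from the repository.

```python
import subprocess


def build_command(app_image, struct_image, output_path, domain_name):
    """Assemble the argument list for run.py (hypothetical helper)."""
    return [
        "python", "run.py",
        "--app_image_path", str(app_image),
        "--struct_image_path", str(struct_image),
        "--output_path", str(output_path),
        "--domain_name", domain_name,
    ]


def run_transfer(app_image, struct_image, output_path, domain_name):
    """Run the command; assumes the repository root is the working
    directory and the cross_image conda environment is active."""
    cmd = build_command(app_image, struct_image, output_path, domain_name)
    return subprocess.run(cmd, check=True)


# Placeholder paths and domain name, for illustration only.
cmd = build_command(
    "inputs/appearance.png", "inputs/structure.png", "outputs", "animal"
)
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.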
Highlighted Details
Maintenance & Community
The code builds upon the Hugging Face diffusers library and borrows code from other repositories for inversion, masking, and generation quality improvements. Citation details are provided for academic use.
Licensing & Compatibility
The README does not explicitly state a license. The project relies on the diffusers library, which suggests potential compatibility with that library's license, but this is not confirmed by the repository itself. Users should verify licensing before using the code in commercial or closed-source applications.
Limitations & Caveats
The domain_name parameter is required when use_masked_adain is True, because it is used for mask computation. This suggests that objects from poorly defined domains may be handled less reliably, since a usable mask depends on this parameter. The project is presented as the official implementation of a research paper, so it may be intended primarily for research use.
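To illustrate why a mask matters for use_masked_adain, here is a minimal sketch of adaptive instance normalization (AdaIN) restricted to masked regions. It uses the standard AdaIN formula (normalize the content statistics, then rescale to the style statistics). How the repository actually computes its masks and where it applies the operation are not described above, so the function `masked_adain`, the array shapes, and the rectangular toy masks are assumptions made for illustration.

```python
import numpy as np


def masked_adain(content, style, content_mask, style_mask, eps=1e-5):
    """Match per-channel mean/std of the masked content region to the
    masked style region (hypothetical helper for illustration).

    content, style: float arrays of shape (C, H, W)
    content_mask, style_mask: boolean arrays of shape (H, W)
    """
    out = content.copy()
    for c in range(content.shape[0]):
        c_vals = content[c][content_mask]
        s_vals = style[c][style_mask]
        # Standard AdaIN: whiten the content statistics, then apply
        # the style statistics, using only pixels inside each mask.
        normalized = (c_vals - c_vals.mean()) / (c_vals.std() + eps)
        out[c, content_mask] = normalized * s_vals.std() + s_vals.mean()
    return out


rng = np.random.default_rng(0)
content = rng.normal(loc=0.0, scale=1.0, size=(4, 8, 8))
style = rng.normal(loc=3.0, scale=2.0, size=(4, 8, 8))

# Toy rectangular masks standing in for object masks.
content_mask = np.zeros((8, 8), dtype=bool)
content_mask[2:6, 2:6] = True
style_mask = np.zeros((8, 8), dtype=bool)
style_mask[1:7, 1:7] = True

result = masked_adain(content, style, content_mask, style_mask)
print(result.shape)
```

Without the masks, background pixels would contribute to the statistics and could skew the colors applied to the object, which is one plausible reason the mask computation needs to know the object's domain.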