Research paper code for emergent semantic correspondence via image diffusion
This repository provides Diffusion Features (DIFT), a method for extracting dense semantic correspondences from images using diffusion models. It is designed for researchers and practitioners in computer vision and machine learning who need robust feature extraction for tasks like image editing, segmentation, and object matching. DIFT leverages emergent properties of diffusion models to establish correspondences, offering a novel approach to feature representation.
How It Works
DIFT extracts features by querying intermediate layers of pre-trained diffusion models (Stable Diffusion and ADM) at specific timesteps and U-Net layers. This approach capitalizes on the diffusion process's ability to capture rich semantic information at various scales. By analyzing the feature maps, DIFT identifies corresponding points across images, even for objects with significant appearance or viewpoint changes.
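The matching step described above can be illustrated with a small sketch. It assumes the feature maps have already been extracted as arrays of shape (channels, height, width) and uses cosine similarity with a nearest-neighbour lookup, which is a common way to compare dense features and not necessarily the exact procedure in the repository. The function name match_point and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def match_point(feat_src, feat_tgt, y, x, eps=1e-8):
    """Find the location in feat_tgt whose feature is most similar to feat_src[:, y, x].

    Both inputs are (C, H, W) arrays. Returns ((row, col), similarity_map).
    Hypothetical helper for illustration, not part of the DIFT code base.
    """
    # L2-normalise along the channel axis so a dot product equals cosine similarity.
    src = feat_src / (np.linalg.norm(feat_src, axis=0, keepdims=True) + eps)
    tgt = feat_tgt / (np.linalg.norm(feat_tgt, axis=0, keepdims=True) + eps)

    query = src[:, y, x]                          # (C,) feature at the query point
    sim = np.tensordot(query, tgt, axes=(0, 0))   # (H, W) cosine similarity map

    row, col = np.unravel_index(np.argmax(sim), sim.shape)
    return (int(row), int(col)), sim

# Toy data: the "target" features are the source features shifted by (2, 3),
# so the point (1, 4) in the source should match (3, 7) in the target.
rng = np.random.default_rng(0)
feat_src = rng.normal(size=(16, 8, 8))
feat_tgt = np.roll(feat_src, shift=(2, 3), axis=(1, 2))

match, sim = match_point(feat_src, feat_tgt, y=1, x=4)
print(match)
```

With real diffusion features the similarity map is usually computed at feature resolution and then upsampled to image resolution before taking the argmax; that step is omitted here to keep the sketch short.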
Quick Start & Requirements
Create and activate the environment with conda env create -f environment.yml and conda activate dift.
Dependencies are defined in environment.yml or setup_env.sh; PyTorch is required.
A notebook (demo.ipynb) and a Colab demo are available for trying out DIFT.
To extract features, run python extract_dift.py with the input and output paths, image size, timestep (t), U-Net layer index (up_ft_index), prompt, and ensemble size.
Highlighted Details
Maintenance & Community
The project is associated with the NeurIPS 2023 paper "Emergent Correspondence from Image Diffusion" by Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, and Bharath Hariharan.
Licensing & Compatibility
The repository does not explicitly state a license. The code relies on external diffusion models (Stable Diffusion, ADM) which have their own licenses. Compatibility for commercial use or closed-source linking would require verification of these underlying model licenses.
Limitations & Caveats
The README mentions that ensemble_size and img_size can be reduced if memory issues are encountered, suggesting potentially high resource requirements. Specific parameter choices (t, up_ft_index) are crucial for optimal performance on different tasks and models.
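As a rough illustration of the memory advice above, the sketch below builds an extract_dift.py command line and lowers ensemble_size first, then img_size, when a lighter configuration is needed. The flag names are assumed to mirror the parameter names mentioned in this summary and should be checked against the script's own help output. The helper names, file paths, and numeric values are hypothetical placeholders, not recommended settings.

```python
import shlex

def build_extract_command(input_path, output_path, img_size, t,
                          up_ft_index, prompt, ensemble_size):
    """Assemble an argument list for extract_dift.py.

    Flag names are an assumption based on the parameter names in the text.
    """
    width, height = img_size
    return [
        "python", "extract_dift.py",
        "--input_path", input_path,
        "--output_path", output_path,
        "--img_size", str(width), str(height),
        "--t", str(t),
        "--up_ft_index", str(up_ft_index),
        "--prompt", prompt,
        "--ensemble_size", str(ensemble_size),
    ]

def lighter(settings, min_side=64):
    """Return a copy of settings that should need less memory.

    Halves ensemble_size until it reaches 1, then halves img_size instead.
    """
    reduced = dict(settings)
    if reduced["ensemble_size"] > 1:
        reduced["ensemble_size"] = max(1, reduced["ensemble_size"] // 2)
    else:
        width, height = reduced["img_size"]
        reduced["img_size"] = (max(min_side, width // 2), max(min_side, height // 2))
    return reduced

# Placeholder settings for illustration only.
settings = {
    "input_path": "cat.png",
    "output_path": "dift_cat.pt",
    "img_size": (768, 768),
    "t": 261,
    "up_ft_index": 1,
    "prompt": "a photo of a cat",
    "ensemble_size": 8,
}

lighter_settings = lighter(settings)
command = build_extract_command(**lighter_settings)
print(shlex.join(command))
```

Reducing the ensemble size first keeps the input resolution, and therefore the spatial detail of the feature map, unchanged; shrinking the image is left as the fallback.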