Neural_Gaffer by Haian-Jin

2D relighting diffusion model for single-image object relighting

created 1 year ago
310 stars

Top 87.8% on sourcepulse

View on GitHub
Project Summary

Neural Gaffer is an end-to-end 2D relighting diffusion model that accurately relights any object in a single image under various lighting conditions. It targets researchers and practitioners in computer vision and graphics, enabling applications like text-based relighting, object insertion, and serving as a prior for 3D relighting tasks.

How It Works

The model frames relighting as conditional diffusion: given a single input image of an object and a target lighting condition (an HDR environment map), it synthesizes the relit image. A key aspect is its ability to act as a prior for 3D relighting, directly relighting radiance fields without requiring inverse rendering, which is a novel application of diffusion models in this domain.
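For intuition, here is a minimal, schematic sketch of image-conditioned diffusion sampling for relighting. It assumes a DDIM-style sampler and Zero123-style conditioning (source-image latent concatenated channel-wise, lighting passed as an embedding); the function and argument names are illustrative stand-ins, not the repository's actual API.

```python
import torch

@torch.no_grad()
def relight_latent(unet, image_latent, light_embedding, steps=50):
    """Sample a relit latent conditioned on the source image and target lighting.

    `unet` is a hypothetical stand-in for the repository's conditional UNet.
    """
    x = torch.randn_like(image_latent)            # start from pure Gaussian noise
    alphas = torch.linspace(0.001, 0.999, steps)  # toy cumulative noise schedule
    for i in range(steps):
        t = torch.full((x.shape[0],), steps - 1 - i, device=x.device)
        # Condition on the source image by channel-wise concatenation, and on
        # the target lighting via an extra embedding, as in Zero123-style models.
        eps = unet(torch.cat([x, image_latent], dim=1), t, light_embedding)
        a = alphas[i]
        a_next = alphas[i + 1] if i + 1 < steps else torch.tensor(1.0)
        x0_pred = (x - (1 - a).sqrt() * eps) / a.sqrt()           # predicted clean latent
        x = a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps   # deterministic DDIM step
    return x
```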

Quick Start & Requirements

  • Installation: create and activate a conda environment, then run pip install -r requirements.txt followed by pip3 install -U xformers==0.0.28 --index-url https://download.pytorch.org/whl/cu118 (see the command sketch after this list).
  • Prerequisites: Python 3.9, CUDA 11.8 (for xformers), and PyTorch. Checkpoints and datasets must be downloaded separately.
  • Resources: 2D relighting inference on a single A6000 GPU takes ~20 minutes for 2,400 images (roughly 0.5 s per image); 3D relighting inference takes ~24 minutes for 2,888 images. Training requires 8 GPUs.
  • Links: Project Page, Paper
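Taken together, the quick-start steps look like the following. The pip commands are quoted from the repository's instructions; the conda environment name is illustrative:

```bash
# Create and activate a conda environment (name is illustrative)
conda create -n neural_gaffer python=3.9
conda activate neural_gaffer

# Install dependencies, then xformers built against CUDA 11.8
pip install -r requirements.txt
pip3 install -U xformers==0.0.28 --index-url https://download.pytorch.org/whl/cu118
```

Checkpoints and datasets still need to be fetched per the README before running inference.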

Highlighted Details

  • End-to-end 2D relighting of single images.
  • Enables text-based relighting and object insertion.
  • Acts as a prior for 3D radiance field relighting.
  • Official code for NeurIPS 2024 paper.

Maintenance & Community

The project accompanies a NeurIPS 2024 paper. The primary contributor is Haian Jin, and the README includes a TODO list of planned releases.

Licensing & Compatibility

The repository does not explicitly state a license. The codebase is built on top of Zero123-HF.

Limitations & Caveats

The model was trained at a 256x256 resolution, which limits its ability to preserve fine details and can lead to relighting failures. The VAE used in the base diffusion model struggles with identity preservation for detailed objects at this resolution. Finetuning at higher resolutions or using a more powerful diffusion model is suggested to mitigate this.
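As a practical consequence, inputs are best matched to the training resolution before inference. A minimal preprocessing sketch (file names are illustrative):

```python
# Resize the input to the 256x256 training resolution before inference;
# larger inputs would be out of distribution for the model.
from PIL import Image

img = Image.open("object.png").convert("RGB").resize((256, 256), Image.LANCZOS)
img.save("object_256.png")
```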

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

29 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Patrick von Platen (core contributor to Hugging Face Transformers and Diffusers), and 12 more.

stablediffusion by Stability-AI

Latent diffusion model for high-resolution image synthesis
Top 0.1% · 41k stars · created 2 years ago · updated 1 month ago