image-sculpting by vision-x-nyu

Image editing framework using 3D geometry

created 1 year ago
297 stars

Top 90.4% on sourcepulse

Project Summary

Image Sculpting offers a novel framework for precise 2D image editing by leveraging 3D geometry. It targets researchers and artists seeking granular control over object manipulation, moving beyond ambiguous text-based edits to enable direct interaction with 3D models derived from single images.

How It Works

The core approach converts 2D objects into editable 3D representations. This allows for direct manipulation of pose, rotation, translation, and composition. After editing, the 3D models are re-rendered into the 2D image, with a coarse-to-fine enhancement process ensuring high-fidelity integration. This hybrid method combines generative model flexibility with the precision of traditional graphics pipelines.
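
As an illustration of the kind of direct manipulation described above, the following is a minimal sketch of a rigid edit (rotation followed by translation) applied to mesh vertices. It is not taken from the repository: the helper names `rotate_z` and `translate`, the three-vertex `triangle`, and the choice of the z-axis are all assumptions made for the example, and the project's own editing is done on full reconstructed meshes.

```python
import math


def rotate_z(vertices, degrees):
    """Rotate (x, y, z) vertices about the z-axis by the given angle in degrees."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in vertices]


def translate(vertices, offset):
    """Shift every vertex by the (dx, dy, dz) offset."""
    dx, dy, dz = offset
    return [(x + dx, y + dy, z + dz) for x, y, z in vertices]


# Hypothetical stand-in for a reconstructed mesh: just three vertices.
triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Rotate 90 degrees about z, then move half a unit along x.
edited = translate(rotate_z(triangle, 90.0), (0.5, 0.0, 0.0))
```

Because the edit acts on geometry rather than on pixels, the same two functions can be chained in any order or repeated, which is the sense in which 3D control is more precise than a text prompt.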

Quick Start & Requirements

  • Install: Clone the repository, create a virtual environment, and install dependencies using pip install -r requirements.txt. A PyTorch build for CUDA 11.8 is required.
  • Prerequisites: An NVIDIA RTX 4090 with CUDA 12.0 is recommended. Custom data additionally needs a background-removal tool (e.g., Clipdrop) and the Zero-1-to-3 XL model weights.
  • Resources: Download the provided reconstructed meshes from Google Drive. Setting up custom data involves 3D reconstruction with Zero-1-to-3 and, potentially, DreamBooth fine-tuning.
  • Links: Project Page, Video, Paper.

Highlighted Details

  • Enables precise editing operations such as pose editing, rotation, translation, carving, and serial addition.
  • Integrates 3D geometry control with generative models for high-fidelity results.
  • Supports re-rendering and texture enhancement via DreamBooth fine-tuning.
  • Leverages Zero-1-to-3 for single-image 3D reconstruction.

Maintenance & Community

The project is associated with New York University and Intel Labs. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The README notes that while other deformation methods are possible, using bones is recommended for intuitive, physics-aware editing. Successful 3D reconstruction from single images may require careful preprocessing, including recentering and scaling.
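
The recentering and scaling step can be pictured with a small sketch. The code below is an assumption-laden illustration, not the repository's preprocessing: the function names, the binary-mask input, and the `fill_ratio` default of 0.8 are all hypothetical. Given a foreground mask (for example, the output of background removal), it computes the scale factor and paste offset that would center the object on a square canvas.

```python
def bounding_box(mask):
    """Return (left, top, right, bottom) of nonzero pixels; right/bottom are exclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def recenter_params(mask, canvas_size, fill_ratio=0.8):
    """Compute scale, resized object size, and top-left paste offset.

    The longer side of the object's bounding box is scaled to occupy
    fill_ratio of the square canvas (an assumed value), and the resized
    box is then centered.
    """
    left, top, right, bottom = bounding_box(mask)
    width, height = right - left, bottom - top
    scale = fill_ratio * canvas_size / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    offset = ((canvas_size - new_w) // 2, (canvas_size - new_h) // 2)
    return scale, (new_w, new_h), offset


# Hypothetical 4x6 mask with a 4-pixel-wide, 2-pixel-tall object.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
scale, size, offset = recenter_params(mask, canvas_size=100)
```

In practice the resulting scale and offset would be handed to an image library to resize and paste the cut-out object; that step is omitted here to keep the sketch dependency-free.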

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 7 more.

stable-dreamfusion by ashawkey

0.1%
9k
Text-to-3D model using NeRF and diffusion
created 2 years ago
updated 1 year ago