Make-It-3D by junshutang

3D creation from a single image using diffusion prior (ICCV 2023)

Created 2 years ago

1,885 stars

Top 22.8% on SourcePulse

Project Summary

Make-It-3D addresses the challenge of generating high-fidelity 3D models from a single 2D image, a task complicated by the need to infer unseen geometry and textures. It targets researchers and developers in computer graphics and AI, offering a novel approach that leverages diffusion models for 3D-aware supervision, enabling applications like text-to-3D creation.

How It Works

The method employs a two-stage optimization pipeline. The first stage optimizes a neural radiance field (NeRF) using constraints from the input image and a diffusion prior for novel views. The second stage refines this coarse model into textured point clouds, further enhancing realism with the diffusion prior and high-quality textures from the original image. This diffusion prior acts as a powerful 3D-aware regularizer, guiding the reconstruction of unseen parts.
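The first-stage objective described above can be illustrated with a toy sketch. This is not the authors' code: it assumes a plain mean squared error stands in for the input-image constraint and a caller-supplied callable stands in for the diffusion prior. The names reference_loss, stage1_loss, prior_loss, and prior_weight are hypothetical.

```python
from typing import Callable, Sequence

Pixels = Sequence[float]


def reference_loss(rendered: Pixels, target: Pixels) -> float:
    """Mean squared error between a rendered view and the input image.

    Assumption: MSE is used here only as a stand-in for the image constraint.
    """
    if len(rendered) != len(target):
        raise ValueError("rendered and target must have the same length")
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(target)


def stage1_loss(
    rendered: Pixels,
    target: Pixels,
    is_reference_view: bool,
    prior_loss: Callable[[Pixels], float],
    prior_weight: float = 1.0,
) -> float:
    """Pick the supervision signal for one rendered view.

    At the reference (input) view the render is compared with the image.
    At any other view there is no ground truth, so a prior term
    (hypothetical callable) scores the render instead.
    """
    if is_reference_view:
        return reference_loss(rendered, target)
    return prior_weight * prior_loss(rendered)
```

The split mirrors the description: only the reference view has ground-truth pixels, so unseen views rely entirely on the prior term.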

Quick Start & Requirements

Installation: pip install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio===0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html, followed by several pip install git+... commands for dependencies such as tiny-cuda-nn, CLIP, diffusers, huggingface_hub, pytorch3d, and contextual_loss_pytorch. The packages listed in requirements.txt and the raymarching module must also be installed.
Prerequisites: CUDA 11.3, Python 3.x, and a Hugging Face token for Stable Diffusion access. Pre-trained weights for DPT (depth estimation) and the Segment Anything Model (SAM) are also required.
Setup: Clone DPT and download its weights. Expect setup to take some effort, since it involves several complex dependencies and model downloads.
Links: Project page: https://make-it-3d.github.io/
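Because the installation pins exact PyTorch builds, it can help to check an environment against those pins before installing the remaining dependencies. The following is a minimal sketch using only the Python standard library. The helper names installed_versions and find_mismatches are hypothetical, and it assumes the installed package metadata reports the version string with its +cu113 suffix.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Version pins copied from the installation command above.
PINNED = {
    "torch": "1.10.0+cu113",
    "torchvision": "0.11.1+cu113",
    "torchaudio": "0.10.0+cu113",
}


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(
    installed: Dict[str, Optional[str]],
    pinned: Dict[str, str] = PINNED,
) -> List[str]:
    """Return one human-readable message per package that deviates from its pin."""
    problems = []
    for name, wanted in pinned.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif have != wanted:
            problems.append(f"{name}: found {have}, expected {wanted}")
    return problems


if __name__ == "__main__":
    report = find_mismatches(installed_versions(PINNED))
    for line in report or ["all pinned packages match"]:
        print(line)
```

Running the script in the target environment prints one line per deviating package, or a single confirmation line when all three pins match.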

Highlighted Details

ICCV 2023 publication.
Leverages Stable Diffusion 2.0 as a diffusion prior.
Utilizes DPT for depth estimation and SAM for masking.
Supports text-conditioned 3D creation and texture editing.

Maintenance & Community

The project is associated with ICCV 2023. A Jittor implementation is also available. The README lists planned features (now completed) and acknowledges borrowing heavily from Stable-Dreamfusion.

Licensing & Compatibility

The repository does not explicitly state a license. The dependencies include libraries with various licenses (e.g., PyTorch, Hugging Face libraries). Commercial use may require careful review of all dependency licenses.

Limitations & Caveats

The method is noted to struggle with complex scenes and with images that do not feature a single, centered object; in such cases it may fail to reconstruct solid geometry.

Explore Similar Projects

PonderV2 by OpenGVLab

sjc by pals-ttic

awesome-3DGS by qqqqqqy0227

richdreamer by modelscope

SCube by nv-tlabs

autovfx by haoyuhsu

WonderWorld by KovenYu

Paint3D by OpenTexture

GaussianObject by chensjtu

zero123 by cvlab-columbia

DreamCraft3D by deepseek-ai

stable-dreamfusion by ashawkey