3D creation from a single image using diffusion prior (ICCV 2023)
Top 23.7% on sourcepulse
Make-It-3D addresses the challenge of generating high-fidelity 3D models from a single 2D image, a task complicated by the need to infer unseen geometry and textures. It targets researchers and developers in computer graphics and AI, offering a novel approach that leverages diffusion models for 3D-aware supervision, enabling applications like text-to-3D creation.
How It Works
The method employs a two-stage optimization pipeline. The first stage optimizes a neural radiance field (NeRF) using constraints from the input image and a diffusion prior for novel views. The second stage refines this coarse model into textured point clouds, further enhancing realism with the diffusion prior and high-quality textures from the original image. This diffusion prior acts as a powerful 3D-aware regularizer, guiding the reconstruction of unseen parts.
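The two-stage loop above can be sketched in miniature. This is a hypothetical illustration, not the repository's actual API: the "losses" are toy quadratics standing in for the reference-view reconstruction term and the diffusion-prior (score-distillation-style) term on randomly sampled novel views, and "stage 2" simply continues optimizing the same parameters in place of texture refinement on the exported point cloud.

```python
import random

def total_loss(params, view, lam=0.1):
    # Toy stand-ins: l_ref mimics matching the input image at the
    # reference view; l_sds mimics the diffusion prior scoring a
    # rendering from a sampled novel view.
    l_ref = sum(p * p for p in params)
    l_sds = sum((p - view) ** 2 for p in params)
    return l_ref + lam * l_sds

def grad(params, view, lam=0.1):
    # Analytic gradient of the toy quadratic objective above.
    return [2 * p + lam * 2 * (p - view) for p in params]

def optimize(params, steps, lr=0.05, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        view = rng.uniform(-1.0, 1.0)  # sample a random novel viewpoint
        g = grad(params, view)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params

# Stage 1: optimize a coarse NeRF-like representation.
coarse = optimize([1.0, -0.5, 0.3], steps=200)
# Stage 2: refine the coarse result (standing in for the textured
# point-cloud refinement stage).
fine = optimize(coarse, steps=200)
print(fine)
```

The structural point is that both stages minimize a combined objective — reference-view fidelity plus a diffusion-prior term over novel views — with the second stage initialized from the first.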
Quick Start & Requirements
```
pip install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```

This is followed by several `pip install git+...` commands for dependencies such as `tiny-cuda-nn`, `CLIP`, `diffusers`, `huggingface_hub`, `pytorch3d`, and `contextual_loss_pytorch`. The project also requires the packages in `requirements.txt` and building the `raymarching` extension.
Highlighted Details
Maintenance & Community
The project is associated with ICCV 2023. A Jittor implementation is also available. The README lists planned features (now completed) and acknowledges borrowing heavily from Stable-Dreamfusion.
Licensing & Compatibility
The repository does not explicitly state a license. The dependencies include libraries with various licenses (e.g., PyTorch, Hugging Face libraries). Commercial use may require careful review of all dependency licenses.
Limitations & Caveats
The method struggles with complex scenes or images that do not feature a single, centered object, and may fail to reconstruct solid geometry in such cases.
Last updated: about 1 year ago; the project appears inactive.