NeuralLift-360 by VITA-Group

Lift 2D photos to 3D objects

Created 3 years ago

318 stars

Project Summary

NeuralLift-360 addresses the challenge of reconstructing a 3D object with 360° views from a single 2D image. It is designed for researchers and developers in computer vision and graphics interested in novel view synthesis and 3D reconstruction from limited input. The project enables the creation of multi-view representations from single images, facilitating applications in virtual reality, augmented reality, and content creation.

How It Works

The project uses a diffusion model, building on the Stable DreamFusion codebase, to generate novel views. Depth estimates from external tools and a foreground mask guide the 3D reconstruction. Text inversion can optionally be used to refine the text embeddings so that they represent the input object better during diffusion. Together, these let the method produce a 3D object from a single 2D image by iteratively refining the object's representation across multiple views.
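To make the role of the depth map and foreground mask more concrete, here is a minimal sketch of one plausible preprocessing step: rescaling an estimated depth map to [0, 1] over the foreground pixels only. This is an illustration and not the repository's code; the function name normalize_foreground_depth is hypothetical, and plain Python lists stand in for image arrays.

```python
from typing import List, Optional

Grid = List[List[float]]


def normalize_foreground_depth(
    depth: Grid, mask: List[List[int]]
) -> List[List[Optional[float]]]:
    """Rescale depth to [0, 1] over foreground pixels; background becomes None.

    Hypothetical helper for illustration; not part of NeuralLift-360.
    """
    # Collect depth values only where the mask marks foreground (non-zero).
    fg = [d for drow, mrow in zip(depth, mask) for d, m in zip(drow, mrow) if m]
    if not fg:
        raise ValueError("mask contains no foreground pixels")
    lo, hi = min(fg), max(fg)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat depth map
    return [
        [(d - lo) / span if m else None for d, m in zip(drow, mrow)]
        for drow, mrow in zip(depth, mask)
    ]


depth = [[2.0, 4.0], [6.0, 9.0]]
mask = [[1, 1], [1, 0]]  # bottom-right pixel is background
print(normalize_foreground_depth(depth, mask))
```

Restricting the normalization to masked pixels keeps background depth values from stretching the range, which matters when only the object itself is being reconstructed.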

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
For Gradio App: pip install gradio
Depth estimation relies on external tools (Boost Your Own Depth, LeReS).
Foreground mask generation requires a separate tool (e.g., https://github.com/Ir1d/image-background-remove-tool).
Text inversion requires accelerate and a pre-trained Stable Diffusion model (e.g., runwayml/stable-diffusion-v1-5).
Official website: [Website]
Colab notebook for depth export: [Colab Notebook Link]
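Because gradio and accelerate are needed only for the optional app and text inversion steps, it can help to check whether they are installed before running those steps. The snippet below is a small sketch that uses only the Python standard library; the helper name missing_packages is hypothetical and not part of the project.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level modules."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# gradio is needed only for the app; accelerate only for text inversion.
optional = ["gradio", "accelerate"]
missing = missing_packages(optional)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All optional packages are available.")
```

The check only confirms that a module can be located; it does not verify versions or that the package imports cleanly.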

Highlighted Details

CVPR 2023 Highlight paper.
Basic workflow and Gradio App released.
Supports optional text inversion for improved embeddings.
Generates output videos (e.g., lift_ep0100_rgb.mp4) for testing.
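When comparing test renders from several checkpoints, it can be convenient to extract the epoch from the video filename. The sketch below assumes a name_epNNNN_kind.mp4 pattern inferred solely from the single example lift_ep0100_rgb.mp4; the actual naming scheme may differ, and parse_render_name is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class RenderName(NamedTuple):
    run: str
    epoch: int
    kind: str


# Assumed pattern, inferred from the single example lift_ep0100_rgb.mp4.
_PATTERN = re.compile(r"^(?P<run>.+)_ep(?P<epoch>\d+)_(?P<kind>[a-z]+)\.mp4$")


def parse_render_name(filename: str) -> Optional[RenderName]:
    """Split a render filename into run name, epoch, and output kind."""
    m = _PATTERN.match(filename)
    if m is None:
        return None  # filename does not follow the assumed pattern
    return RenderName(m["run"], int(m["epoch"]), m["kind"])


print(parse_render_name("lift_ep0100_rgb.mp4"))
```

Returning None for names that do not match keeps the helper safe to run over a directory that also contains unrelated files.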

Maintenance & Community

Codebase is based on https://github.com/ashawkey/stable-dreamfusion.
Acknowledgement to Jiaxiang Tang for discussions.
Citation details provided for academic use.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The Gradio App is noted to be slower than running the scripts directly because it renders on the fly. Configurations are currently loaded from predefined YAML files, and the authors plan to update this. Imagic finetuning is listed as "Coming soon."
