NeuralLift-360 by VITA-Group

Lift 2D photos to 3D objects

created 2 years ago
318 stars

Top 86.3% on sourcepulse

Project Summary

NeuralLift-360 addresses the challenge of reconstructing a 3D object with 360° views from a single 2D image. It is designed for researchers and developers in computer vision and graphics interested in novel view synthesis and 3D reconstruction from limited input. The project enables the creation of multi-view representations from single images, facilitating applications in virtual reality, augmented reality, and content creation.

How It Works

The project uses a diffusion model, building on the Stable DreamFusion codebase, to generate novel views of the object. Depth estimated by external tools (e.g., Boost Your Own Depth, LeRes) and a foreground mask guide the 3D reconstruction. Optionally, text inversion (textual inversion) refines the text embedding so that it better describes the input object during diffusion. The 3D object is produced from the single 2D input by iteratively refining its representation across multiple views.
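
To make the role of the foreground mask concrete, here is a minimal sketch of how a mask might be applied to an input image and its depth map before reconstruction. It is not taken from the NeuralLift-360 code: the function name apply_foreground_mask, the 0.5 threshold, the white background, and the use of NaN for background depth are all assumptions chosen for illustration.

```python
import numpy as np


def apply_foreground_mask(rgb, depth, mask, background=1.0):
    """Composite the foreground onto a flat background and mask the depth map.

    Hypothetical helper for illustration only.
    rgb:   float array, shape (H, W, 3), values in [0, 1]
    depth: float array, shape (H, W)
    mask:  array, shape (H, W); values above 0.5 count as foreground
    """
    fg = np.asarray(mask) > 0.5
    rgb = np.asarray(rgb, dtype=np.float32)
    depth = np.asarray(depth, dtype=np.float32)
    # Keep foreground pixels; paint everything else with the background value.
    rgb_out = np.where(fg[..., None], rgb, np.float32(background))
    # Mark background depth as NaN so later steps can ignore it.
    depth_out = np.where(fg, depth, np.nan)
    return rgb_out, depth_out
```

With a mask like this, only object pixels would contribute to any depth-based guidance, while the background stays a constant color.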

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • For Gradio App: pip install gradio
  • Depth estimation requires external tools (Boost Your Own Depth, LeRes).
  • Foreground mask generation requires a separate tool (e.g., https://github.com/Ir1d/image-background-remove-tool).
  • Text inversion requires the accelerate package and a pre-trained Stable Diffusion model (e.g., runwayml/stable-diffusion-v1-5).
  • Official website: [Website]
  • Colab notebook for depth export: [Colab Notebook Link]
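
Since several of the items above are optional extras, a small pre-flight check can report which ones are missing from the current environment. The sketch below uses only the Python standard library. The helper name missing_packages is hypothetical, and it assumes the import names match the package names (gradio, accelerate).

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Optional extras named in the list above; the core dependencies
    # are expected to come from requirements.txt.
    optional = ["gradio", "accelerate"]
    missing = missing_packages(optional)
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("All optional packages are importable.")
```

The check only confirms that a module can be found; it does not verify versions or that external tools such as the depth estimators are set up.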

Highlighted Details

  • CVPR 2023 Highlight paper.
  • Basic workflow and Gradio App released.
  • Supports optional text inversion for improved embeddings.
  • Generates output videos (e.g., lift_ep0100_rgb.mp4) for testing.
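
When a run writes several test videos, it can help to pick out the most recent one. The following sketch assumes the naming pattern lift_ep<epoch>_rgb.mp4, which is inferred from the single example filename above and may not cover every file the project produces. The helper latest_rgb_video is hypothetical.

```python
import re

# Pattern inferred from the single example name "lift_ep0100_rgb.mp4".
_RGB_VIDEO = re.compile(r"^lift_ep(\d+)_rgb\.mp4$")


def latest_rgb_video(filenames):
    """Return the RGB video with the highest epoch number, or None if none match."""
    best_name, best_epoch = None, -1
    for name in filenames:
        match = _RGB_VIDEO.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name
```

In practice the list of names could come from os.listdir on whatever output directory the run was configured to use.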

Maintenance & Community

  • Codebase is based on https://github.com/ashawkey/stable-dreamfusion.
  • Acknowledgement to Jiaxiang Tang for discussions.
  • Citation details provided for academic use.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The Gradio App is slower than running the scripts directly because it renders on the fly. Configuration is currently loaded from predefined YAML files, and the authors plan to update this. Imagic finetuning is listed as "Coming soon."

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
