WonderWorld by KovenYu

Interactive 3D scene generation from a single image

Created 1 year ago

699 stars

Top 48.9% on SourcePulse

Project Summary

WonderWorld generates interactive, explorable 3D scenes from a single input image. Aimed at researchers and developers in computer graphics and AI, it builds large-scale 3D environments by iteratively generating new scene components in response to user interaction and optional LLM-generated descriptions.

How It Works

The system uses a diffusion-based approach for scene generation, guided by depth estimation and user input. It relies on PyTorch3D for rendering and integrates models such as Marigold for depth prediction and RepViT for scene understanding. Users define new scene elements interactively by navigating to a novel viewpoint and supplying a text prompt (written manually or generated by GPT-4), which lets them expand and refine the 3D environment iteratively.
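The interaction loop described above can be sketched schematically. This is an illustration of the control flow only, not the project's code: `Scene`, `Expansion`, `expand_scene`, and the `describe` callback (standing in for the optional GPT-4 step) are hypothetical names, and the depth, diffusion, and rendering stages are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Expansion:
    """One user-driven expansion step (hypothetical structure)."""
    viewpoint: Tuple[float, float, float]  # camera position chosen by the user
    prompt: str                            # text describing the new content


@dataclass
class Scene:
    """A scene grown from a single image (hypothetical structure)."""
    source_image: str
    expansions: List[Expansion] = field(default_factory=list)


def expand_scene(
    scene: Scene,
    viewpoint: Tuple[float, float, float],
    prompt: Optional[str] = None,
    describe: Optional[Callable[[Scene], str]] = None,
) -> Scene:
    """Record one expansion: use the manual prompt if given, else ask `describe`."""
    if prompt is None:
        if describe is None:
            raise ValueError("provide a prompt or a describe() callback")
        prompt = describe(scene)  # stand-in for an LLM-generated description
    # A real system would generate and merge new geometry here.
    scene.expansions.append(Expansion(viewpoint, prompt))
    return scene


scene = Scene("input.png")
expand_scene(scene, (0.0, 0.0, 1.0), prompt="a pine forest at dusk")
expand_scene(scene, (1.0, 0.0, 1.0), describe=lambda s: f"auto description {len(s.expansions)}")
```

Each call corresponds to one "move the camera, describe what should appear" step; the scene accumulates these steps in order.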

Quick Start & Requirements

Installation: Clone the repository, create a mamba environment, and install the dependencies, including PyTorch (tested with CUDA 12.4), PyTorch3D, and the submodules.
Prerequisites: A CUDA-compatible GPU with 48 GB of VRAM is required. An OpenAI API key is needed for the optional GPT-4 integration, and the RepViT model must be downloaded.
Setup: Installing PyTorch3D can be time-consuming. Generating the initial sky image for a new example takes approximately 20 minutes on an A6000 GPU.
Links: Website, arXiv
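Before starting the lengthy setup, it can help to confirm the prerequisites listed above. The following is a minimal preflight sketch, not part of the project. It assumes the OpenAI key is supplied through the conventional `OPENAI_API_KEY` environment variable (the repository may expect it elsewhere) and that `nvidia-smi` is on the path. The threshold `MIN_VRAM_MIB` is also an assumption, set slightly below the nominal 48 GB because drivers often report a little less than the nominal size.

```python
import os
import subprocess
from typing import List, Mapping

# Assumption: accept cards reporting slightly under the nominal 48 GB.
MIN_VRAM_MIB = 48_000


def parse_vram_mib(smi_output: str) -> List[int]:
    """Parse one total-memory value (MiB) per line, skipping non-numeric lines."""
    values = []
    for line in smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib() -> List[int]:
    """Ask nvidia-smi for each GPU's total memory; return [] if unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_vram_mib(result.stdout)


def check_prerequisites(env: Mapping[str, str], vram_mib: List[int],
                        use_gpt4: bool = True) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not vram_mib:
        problems.append("no NVIDIA GPU detected")
    elif max(vram_mib) < MIN_VRAM_MIB:
        problems.append(
            f"largest GPU has {max(vram_mib)} MiB, need about {MIN_VRAM_MIB} MiB"
        )
    if use_gpt4 and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `check_prerequisites(os.environ, query_vram_mib())` on the target machine returns an empty list when both checks pass. Pass `use_gpt4=False` when prompts will be written manually.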

Highlighted Details

Interactive scene generation via novel viewpoint selection and prompt-based expansion.
Optional LLM integration (GPT-4) for automatic scene description generation.
Layer-wise scene generation and ability to load previously generated scenes.
Supports local visualization through a web browser via SSH tunneling.
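Viewing a remotely generated scene in a local browser usually means forwarding a port over SSH. The helper below is a small sketch that assembles a standard `ssh -L` port-forwarding command. The host name, user, and port number in the example are placeholders, since this summary does not say which port the WonderWorld viewer uses.

```python
import shlex
from typing import List, Optional


def ssh_tunnel_command(remote_host: str, remote_port: int,
                       local_port: Optional[int] = None,
                       user: Optional[str] = None) -> List[str]:
    """Build an ssh command that forwards local_port to remote_port on the server."""
    if local_port is None:
        local_port = remote_port  # mirror the remote port by default
    for port in (local_port, remote_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
    target = f"{user}@{remote_host}" if user else remote_host
    # -N: do not run a remote command; -L: local port forwarding.
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", target]


# Placeholder values for illustration only.
command = ssh_tunnel_command("gpu-server", 7777, user="alice")
print(shlex.join(command))  # ssh -N -L 7777:localhost:7777 alice@gpu-server
```

With the tunnel running, the viewer would be reachable at `http://localhost:<local_port>` in the local browser, assuming the server side listens on the forwarded port.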

Maintenance & Community

The project is associated with researchers from institutions like MIT and is linked to the authors' personal websites and Twitter. It acknowledges contributions from various open-source projects.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project requires 48 GB of GPU memory, which limits accessibility. Installation, particularly of PyTorch3D, can be complex and time-consuming. The license status needs clarification before any commercial use.

