Interactive 3D scene generation from a single image
WonderWorld enables interactive 3D scene generation and exploration from a single input image. Targeting researchers and developers in computer graphics and AI, it facilitates the creation of large-scale, explorable 3D environments by iteratively generating new scene components based on user interaction and optional LLM-driven descriptions.
How It Works
The system leverages a diffusion-based approach for scene generation, guided by depth estimation and user input. It uses PyTorch3D for rendering and integrates models such as Marigold for depth prediction and RepViT for scene understanding. Users interactively define new scene elements by navigating to novel viewpoints and providing textual prompts (either manually or via GPT-4), allowing iterative expansion and refinement of the 3D environment.
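The overall loop can be summarized as follows. This is a minimal sketch, not the repository's actual API: every name below is hypothetical, and each pipeline stage is injected as a callable so the control flow stands on its own.

```python
# Minimal sketch of the interactive expansion loop described above.
# Every name here is hypothetical, NOT the repository's actual API;
# pipeline stages are injected as callables so the sketch is self-contained.
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Tuple

@dataclass
class Scene:
    """Placeholder container for accumulated 3D scene content."""
    sections: list = field(default_factory=list)

def explore(
    input_image: Any,
    estimate_depth: Callable[[Any], Any],   # e.g., a Marigold-style depth model
    render: Callable[[Scene, Any], Any],    # e.g., a PyTorch3D-based renderer
    outpaint: Callable[[Any, str], Any],    # diffusion model filling unseen regions
    next_view: Callable[[Scene], Optional[Tuple[Any, str]]],  # user camera + prompt
    max_sections: int = 10,
) -> Scene:
    """Iteratively expand a 3D scene from a single input image."""
    scene = Scene()
    scene.sections.append((input_image, estimate_depth(input_image), None))

    for _ in range(max_sections):
        step = next_view(scene)
        if step is None:                      # user ends the session
            break
        camera, prompt = step                 # novel viewpoint + manual or GPT-4 prompt
        partial = render(scene, camera)       # rendering exposes not-yet-generated regions
        new_view = outpaint(partial, prompt)  # diffusion fills the missing areas
        scene.sections.append((new_view, estimate_depth(new_view), camera))

    return scene
```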
Quick Start & Requirements
Setup requires cloning the repository, creating a mamba environment, and installing dependencies including PyTorch (CUDA 12.4 tested), PyTorch3D, and the bundled submodules.
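After installation, a short sanity check can confirm the environment; this sketch uses only standard PyTorch and PyTorch3D attributes, nothing project-specific.

```python
# Post-install sanity check: verifies the CUDA build of PyTorch and that
# PyTorch3D imports cleanly (its build is the usual pain point).
import torch

print(f"PyTorch {torch.__version__}; CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA build: {torch.version.cuda}")  # README reports CUDA 12.4 as tested

try:
    import pytorch3d
    print(f"PyTorch3D {pytorch3d.__version__} imported successfully")
except ImportError as exc:
    print(f"PyTorch3D is missing or broken: {exc}")
```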
Maintenance & Community
The project comes from researchers at institutions including MIT; the README links to the authors' personal websites and Twitter accounts, and acknowledges contributions from various open-source projects.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires a substantial 48 GB of GPU memory, limiting accessibility; a quick capability check is sketched below. The installation process, particularly for PyTorch3D, can be complex and time-consuming. The license status requires clarification for commercial applications.
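Before attempting a run, it may be worth checking the GPU against the reported requirement. A sketch using standard PyTorch calls; the 48 GB threshold comes from the claim above, not from the code itself.

```python
# Check that the GPU meets the reported ~48 GB VRAM requirement.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1024**3
    print(f"{props.name}: {total_gb:.1f} GB VRAM")
    if total_gb < 48:
        print("Warning: below the ~48 GB this project reportedly needs.")
else:
    print("No CUDA device detected.")
```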