Tencent-Hunyuan
Multi-modal framework for generating and reconstructing explorable 3D worlds
Top 20.7% on SourcePulse
This project introduces HY-World 2.0, a multi-modal framework for generating and reconstructing 3D worlds from diverse inputs like text, images, and video. It targets researchers and developers seeking to create persistent, editable 3D environments, offering a significant advantage over ephemeral video-based world models by producing assets directly compatible with game engines.
How It Works
HY-World 2.0 generates worlds with a four-stage pipeline: panorama generation (HY-Pano 2.0), trajectory planning (WorldNav), world expansion (WorldStereo 2.0), and composition (WorldMirror 2.0 + 3DGS learning). This lets it synthesize high-fidelity, navigable 3D scenes, output as meshes or 3D Gaussian Splatting (3DGS) representations, from text or a single image. For reconstruction, WorldMirror 2.0 acts as a unified feed-forward model: in a single pass over multi-view images or video, it predicts depth, normals, camera parameters, point clouds, and 3DGS attributes, which allows 3D digital twins to be produced without per-scene optimization of those outputs.
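The stage order described above can be written down as a small data structure. This is only an illustrative sketch: the `Stage` class, the `describe` helper, and the constant names are hypothetical and are not part of the project's code or API; only the step and component names come from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    step: str       # what the stage does, as named in the summary
    component: str  # component credited with that step


# Generation stages in the order the summary lists them.
# Hypothetical structure for illustration, not the project's API.
GENERATION_PIPELINE = (
    Stage("panorama generation", "HY-Pano 2.0"),
    Stage("trajectory planning", "WorldNav"),
    Stage("world expansion", "WorldStereo 2.0"),
    Stage("composition", "WorldMirror 2.0 + 3DGS learning"),
)

# Outputs WorldMirror 2.0 is described as predicting in a single pass.
RECONSTRUCTION_OUTPUTS = (
    "depth",
    "normals",
    "camera parameters",
    "point clouds",
    "3DGS attributes",
)


def describe(pipeline):
    """Render the stages as a single arrow-separated line."""
    return " -> ".join(f"{s.step} ({s.component})" for s in pipeline)


if __name__ == "__main__":
    print(describe(GENERATION_PIPELINE))
```

Keeping the stages as ordered data makes it easy to see which components are already released and which are still marked "Coming Soon."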
Quick Start & Requirements
Installation involves cloning the repository, creating a Conda environment (Python 3.10), installing PyTorch with CUDA 12.4 support, and other dependencies. FlashAttention installation is recommended for performance. Detailed usage guides and parameter references are available in DOCUMENTATION.md.
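After setting up the environment, a quick check can confirm that it matches the stated requirements. The sketch below is not from the repository: `check_environment` and `probe` are hypothetical helper names, and it assumes that PyTorch reports its CUDA build through `torch.version.cuda` and that FlashAttention is importable as `flash_attn`.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 10)  # Python version named in the summary
REQUIRED_CUDA = "12.4"     # CUDA version the PyTorch build should target


def check_environment(python_version, torch_cuda_version, has_flash_attn):
    """Return a list of mismatches; an empty list means the setup matches."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, expected 3.10"
        )
    if torch_cuda_version is None:
        problems.append("PyTorch is missing or is a CPU-only build")
    elif torch_cuda_version != REQUIRED_CUDA:
        problems.append(
            f"PyTorch built for CUDA {torch_cuda_version}, expected {REQUIRED_CUDA}"
        )
    if not has_flash_attn:
        problems.append("FlashAttention not installed (recommended, not required)")
    return problems


def probe():
    """Collect the three facts from the running interpreter."""
    try:
        import torch
        cuda = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        cuda = None
    has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    return sys.version_info, cuda, has_flash_attn


if __name__ == "__main__":
    for problem in check_environment(*probe()):
        print(problem)
```

Separating the pure check from the probing step keeps the comparison logic testable on machines without a GPU.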
Install: git clone, conda create, pip install -r requirements.txt
Docs: DOCUMENTATION.md (referenced within the repo)
Highlighted Details
Maintenance & Community
The project is developed by "Team HY-World" and "Team HunyuanWorld." Specific community channels (like Discord/Slack) or detailed roadmaps beyond "Coming Soon" announcements are not explicitly detailed in the README.
Licensing & Compatibility
The project is described as open-source, but a specific license type (e.g., MIT, Apache 2.0) is not explicitly stated in the README. This lack of explicit licensing requires further investigation for commercial use or closed-source integration compatibility.
Limitations & Caveats
Several core components, including the full inference code for World Generation (WorldNav + World Composition) and WorldStereo 2.0 model weights/inference code, are marked as "Coming Soon." The specific open-source license is not clearly defined, posing a potential adoption blocker.