HY-World-2.0 by Tencent-Hunyuan

Multi-modal framework for generating and reconstructing explorable 3D worlds

Created 3 months ago

2,346 stars

Top 18.7% on SourcePulse

1 Expert Loves This Project

Toran Bruce Richards

Founder of AutoGPT

Project Summary

HY-World 2.0 is a multi-modal framework for generating and reconstructing 3D worlds from text, images, and video. It targets researchers and developers who need persistent, editable 3D environments: unlike video-based world models, whose output is ephemeral, it produces assets that are directly compatible with game engines.

How It Works

HY-World 2.0 generates worlds with a four-stage pipeline: panorama generation (HY-Pano 2.0), trajectory planning (WorldNav), world expansion (WorldStereo 2.0), and composition (WorldMirror 2.0 + 3DGS learning). From text or a single image, the pipeline synthesizes high-fidelity, navigable 3D scenes as meshes or 3D Gaussian splats (3DGS). For reconstruction, WorldMirror 2.0 acts as a unified feed-forward model: given multi-view images or video, it predicts depth, normals, camera parameters, point clouds, and 3DGS attributes in a single pass, enabling instant 3D digital twins.
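The stage order can be written down as plain data, which makes it easy to see where a run would currently stall. The following is a minimal sketch, not code from the repository: the `Stage` class and the helper functions are hypothetical names introduced for illustration, and the `pending` flags reflect only what this summary reports as "Coming Soon" (a `False` flag means the summary does not list the component as pending, not that it has been verified as released).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """Hypothetical record for one pipeline stage (illustration only)."""
    name: str      # component name as given in this summary
    role: str      # what the stage contributes to world generation
    pending: bool  # True if this summary lists the component as "Coming Soon"


# Order follows the generation pipeline described above.
GENERATION_PIPELINE = [
    Stage("HY-Pano 2.0", "panorama generation", pending=False),
    Stage("WorldNav", "trajectory planning", pending=True),
    Stage("WorldStereo 2.0", "world expansion", pending=True),
    Stage("WorldMirror 2.0 + 3DGS learning", "world composition", pending=True),
]


def describe(pipeline):
    """Render the stages as a single left-to-right chain."""
    return " -> ".join(f"{s.name} ({s.role})" for s in pipeline)


def first_blocked_stage(pipeline):
    """Return the first stage flagged as pending, or None if none are."""
    for stage in pipeline:
        if stage.pending:
            return stage
    return None


if __name__ == "__main__":
    print(describe(GENERATION_PIPELINE))
    blocked = first_blocked_stage(GENERATION_PIPELINE)
    if blocked is not None:
        print(f"End-to-end generation is blocked at: {blocked.name}")
```

Keeping the status next to the stage order means the flags can be flipped as components are released, without touching the rest of the sketch.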

Quick Start & Requirements

Installation involves cloning the repository, creating a Conda environment (Python 3.10), installing PyTorch with CUDA 12.4 support, and other dependencies. FlashAttention installation is recommended for performance. Detailed usage guides and parameter references are available in DOCUMENTATION.md.

Primary Install: git clone, conda create, pip install -r requirements.txt.
Prerequisites: CUDA 12.4 (recommended), Python 3.10.
Links: DOCUMENTATION.md (referenced within the repo).
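Before running anything from the repository, it can help to confirm that the local environment matches the prerequisites listed above. The following is a minimal sketch under stated assumptions: `check_environment` is a hypothetical helper, not part of HY-World 2.0, and it assumes FlashAttention is installed as the `flash_attn` Python package. It only inspects the interpreter and installed packages and does not install anything.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 10), expected_cuda="12.4"):
    """Return a list of human-readable notes about mismatches; empty if none."""
    notes = []

    # Python 3.10 is the version named in the prerequisites.
    found = tuple(sys.version_info[:2])
    if found != tuple(expected_python):
        notes.append(
            f"Python {found[0]}.{found[1]} found, "
            f"{expected_python[0]}.{expected_python[1]} expected"
        )

    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is None:
        notes.append("PyTorch is not installed")
    else:
        import torch

        # torch.version.cuda is None for CPU-only builds.
        if torch.version.cuda != expected_cuda:
            notes.append(
                f"PyTorch built for CUDA {torch.version.cuda}, "
                f"{expected_cuda} recommended"
            )
        if not torch.cuda.is_available():
            notes.append("no CUDA device visible to PyTorch")

    # Assumption: FlashAttention is importable as `flash_attn`.
    if importlib.util.find_spec("flash_attn") is None:
        notes.append("flash_attn not installed (optional, recommended for performance)")

    return notes


if __name__ == "__main__":
    for note in check_environment() or ["environment matches the listed prerequisites"]:
        print(note)
```

The helper returns notes instead of raising, since CUDA 12.4 and FlashAttention are described as recommended rather than required.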

Highlighted Details

Generates real 3D assets (3DGS, meshes, point clouds) directly importable into Unity, Unreal Engine, and Isaac Sim, unlike video-only models.
WorldMirror 2.0 enables instant 3D reconstruction from casual videos or multi-view images in a single forward pass.
Supports interactive character exploration within generated worlds, featuring first-person/third-person navigation and physics-based collision.

Maintenance & Community

The project is developed by "Team HY-World" and "Team HunyuanWorld." The README does not list community channels (such as Discord or Slack) or a roadmap beyond its "Coming Soon" announcements.

Licensing & Compatibility

The project is described as open-source, but the README does not state a specific license (e.g., MIT, Apache 2.0). Confirm the licensing terms before commercial use or integration into closed-source products.

Limitations & Caveats

Several core components, including the full inference code for World Generation (WorldNav + World Composition) and WorldStereo 2.0 model weights/inference code, are marked as "Coming Soon." The specific open-source license is not clearly defined, posing a potential adoption blocker.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive

Star History

148 stars in the last 30 days