HY-World-2.0 by Tencent-Hunyuan

Multi-modal framework for generating and reconstructing explorable 3D worlds

Created 1 month ago
2,105 stars

Top 20.7% on SourcePulse

Project Summary

HY-World 2.0 is a multi-modal framework for generating and reconstructing 3D worlds from text, images, and video. It targets researchers and developers who need persistent, editable 3D environments. Unlike video-based world models, whose output is ephemeral, it produces assets that are directly compatible with game engines.

How It Works

HY-World 2.0 generates worlds through a four-stage pipeline: panorama generation (HY-Pano 2.0), trajectory planning (WorldNav), world expansion (WorldStereo 2.0), and composition (WorldMirror 2.0 + 3DGS learning). From text or a single image, the pipeline synthesizes high-fidelity, navigable 3D scenes as meshes or 3D Gaussian Splatting (3DGS) representations. For reconstruction, WorldMirror 2.0 acts as a unified feed-forward model: given multi-view images or video, it predicts depth, normals, camera parameters, point clouds, and 3DGS attributes in a single pass, enabling near-instant creation of 3D digital twins.
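The stage ordering described above can be sketched as a simple sequential pipeline. This is only an illustration of the data flow, not the project's code: `WorldState`, the stage functions, and `run_generation` are hypothetical names, and each stage is a placeholder that records which artifact it would produce.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class WorldState:
    """Hypothetical container that accumulates intermediate artifacts."""
    prompt: str
    artifacts: List[str] = field(default_factory=list)


# Placeholder stages: each one only records the artifact it would produce.
# None of these functions exist in the HY-World 2.0 repository.
def generate_panorama(state: WorldState) -> WorldState:
    state.artifacts.append("panorama")
    return state


def plan_trajectories(state: WorldState) -> WorldState:
    state.artifacts.append("trajectories")
    return state


def expand_world(state: WorldState) -> WorldState:
    state.artifacts.append("expanded_views")
    return state


def compose_world(state: WorldState) -> WorldState:
    state.artifacts.append("3d_scene")
    return state


# Stage labels follow the component names given in the summary.
GENERATION_STAGES: List[Tuple[str, Callable[[WorldState], WorldState]]] = [
    ("HY-Pano 2.0", generate_panorama),
    ("WorldNav", plan_trajectories),
    ("WorldStereo 2.0", expand_world),
    ("WorldMirror 2.0 + 3DGS learning", compose_world),
]


def run_generation(prompt: str) -> WorldState:
    """Run the placeholder stages in order, passing the state along."""
    state = WorldState(prompt=prompt)
    for _name, stage in GENERATION_STAGES:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_generation("a quiet harbor town at dusk")
    print(result.artifacts)
```

Keeping the stages in an ordered list makes the dependency explicit: each stage consumes what the previous one produced, which is also why the components marked "Coming Soon" would block end-to-end generation.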

Quick Start & Requirements

Installation involves cloning the repository, creating a Conda environment (Python 3.10), and installing PyTorch with CUDA 12.4 support along with the remaining dependencies. Installing FlashAttention is recommended for performance. Detailed usage guides and parameter references are in DOCUMENTATION.md.

  • Primary Install: git clone, conda create, pip install -r requirements.txt.
  • Prerequisites: CUDA 12.4 (recommended), Python 3.10.
  • Links: DOCUMENTATION.md (referenced within the repo).
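
Before installing, it can help to confirm that the active environment matches the stated prerequisites. The following is a minimal sketch, not part of the repository: `check_environment` and `detect_cuda_version` are hypothetical helpers, and the only external call assumed is PyTorch's `torch.version.cuda` attribute, which is read only if PyTorch is already installed.

```python
import sys
from typing import List, Optional, Tuple

# Values taken from the prerequisites listed above.
RECOMMENDED_PYTHON: Tuple[int, int] = (3, 10)
RECOMMENDED_CUDA: str = "12.4"


def check_environment(
    python_version: Tuple[int, int], cuda_version: Optional[str]
) -> List[str]:
    """Return human-readable warnings; an empty list means a match."""
    warnings: List[str] = []
    if tuple(python_version[:2]) != RECOMMENDED_PYTHON:
        warnings.append(
            f"Python {python_version[0]}.{python_version[1]} found, "
            f"{RECOMMENDED_PYTHON[0]}.{RECOMMENDED_PYTHON[1]} expected"
        )
    if cuda_version is None:
        warnings.append("no CUDA-enabled PyTorch build detected")
    elif cuda_version != RECOMMENDED_CUDA:
        warnings.append(
            f"CUDA {cuda_version} found, {RECOMMENDED_CUDA} recommended"
        )
    return warnings


def detect_cuda_version() -> Optional[str]:
    """Report the CUDA version PyTorch was built with, if available."""
    try:
        import torch  # optional: only present after installation
    except ImportError:
        return None
    return torch.version.cuda  # None on CPU-only builds


if __name__ == "__main__":
    problems = check_environment(
        (sys.version_info.major, sys.version_info.minor),
        detect_cuda_version(),
    )
    for line in problems or ["environment matches the stated prerequisites"]:
        print(line)
```

Because CUDA 12.4 is listed as recommended rather than required, the sketch reports a mismatch as a warning instead of raising an error.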

Highlighted Details

  • Generates real 3D assets (3DGS, meshes, point clouds) directly importable into Unity, Unreal Engine, and Isaac Sim, unlike video-only models.
  • WorldMirror 2.0 enables instant 3D reconstruction from casual videos or multi-view images in a single forward pass.
  • Supports interactive character exploration within generated worlds, featuring first-person/third-person navigation and physics-based collision.

Maintenance & Community

The project is developed by "Team HY-World" and "Team HunyuanWorld." Specific community channels (like Discord/Slack) or detailed roadmaps beyond "Coming Soon" announcements are not explicitly detailed in the README.

Licensing & Compatibility

The project is described as open-source, but the README does not state a specific license (e.g., MIT, Apache 2.0). Confirm the license terms before any commercial use or closed-source integration.

Limitations & Caveats

Several core components, including the full inference code for World Generation (WorldNav + World Composition) and WorldStereo 2.0 model weights/inference code, are marked as "Coming Soon." The specific open-source license is not clearly defined, posing a potential adoption blocker.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 2
Issues (30d): 8

Star History

444 stars in the last 30 days
