HunyuanWorld-1.0 by Tencent-Hunyuan

Generate immersive 3D worlds from text or pixels

created 2 weeks ago


1,322 stars

Top 31.0% on sourcepulse

View on GitHub
Project Summary

HunyuanWorld-1.0 generates immersive, explorable, and interactive 3D worlds from text or pixel inputs. It targets developers in virtual reality, game development, and interactive content creation, offering a novel approach to 3D scene generation that combines panoramic proxies with mesh export and disentangled object representations.

How It Works

The framework utilizes a semantically layered 3D mesh representation, leveraging panoramic images as 360° world proxies. This approach facilitates semantic-aware decomposition and reconstruction, enabling the generation of diverse 3D worlds. Key advantages include 360° immersive experiences, mesh export for compatibility with existing graphics pipelines, and disentangled object representations for enhanced interactivity.
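
As a rough mental model of that layered representation, the sketch below describes a generated world as a panoramic proxy plus sky, background, and disentangled foreground-object layers, each backed by its own mesh. These class and field names are illustrative only and are not taken from the HunyuanWorld-1.0 codebase.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative data model only; names are not taken from the HunyuanWorld-1.0 API.
@dataclass
class MeshLayer:
    name: str            # e.g. "sky", "background", "object:tree"
    semantic_label: str  # semantic class assigned during decomposition
    mesh_path: str       # exported mesh file (.ply / .obj / .drc)

@dataclass
class LayeredWorld:
    panorama_path: str                     # 360-degree panoramic proxy image
    sky: Optional[MeshLayer] = None
    background: Optional[MeshLayer] = None
    objects: List[MeshLayer] = field(default_factory=list)  # disentangled, individually editable

    def all_layers(self) -> List[MeshLayer]:
        """Flatten the scene into a render/export order: sky, background, then objects."""
        fixed = [layer for layer in (self.sky, self.background) if layer is not None]
        return fixed + self.objects
```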

Quick Start & Requirements

  • Installation: Requires Python 3.10 and PyTorch 2.5.0+cu124. Setup involves cloning the repository, creating a Conda environment from docker/HunyuanWorld.yaml, installing dependencies including Real-ESRGAN and ZIM, and logging into Hugging Face. Installing Draco is also recommended for exporting meshes in Draco format.
  • Prerequisites: CUDA 12.4. A minimal environment check is sketched after this list.
  • Resources: Model checkpoints are available for download (e.g., HunyuanWorld-PanoDiT-Text: 478MB).
  • Documentation: Examples and a technical report are provided. A modelviewer.html is included for local visualization.
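
Before downloading checkpoints, a quick sanity check like the one below can confirm the environment matches the documented requirements (Python 3.10, PyTorch 2.5.0+cu124, CUDA 12.4). This is a generic snippet, not a script shipped with the repository.

```python
import sys
import torch

# Compare against the versions documented in the quick start.
print(f"Python : {sys.version.split()[0]}")          # expected: 3.10.x
print(f"PyTorch: {torch.__version__}")               # expected: 2.5.0+cu124
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA (torch build): {torch.version.cuda}")  # expected: 12.4
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```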

Highlighted Details

  • Achieves state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds, outperforming baselines in visual quality and geometric consistency across various benchmarks.
  • Supports both text-to-panorama and image-to-panorama generation, followed by scene generation from panoramas.
  • Offers mesh export capabilities and disentangled object representations for interactivity (a mesh-handling sketch follows this list).
  • The open-source version is based on Flux but adaptable to other models like Stable Diffusion.
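
Because the generated worlds export as standard meshes, they can be inspected or converted with off-the-shelf tooling. The sketch below uses the third-party trimesh library (not a dependency listed in the README) to load one exported layer and re-save it as glTF; the file path is a placeholder.

```python
import trimesh  # pip install trimesh

# Placeholder path; point this at a mesh produced by the scene-generation step.
mesh = trimesh.load("world_out/background.ply", force="mesh")

# Quick integrity check on the exported geometry.
print(f"vertices: {len(mesh.vertices)}, faces: {len(mesh.faces)}")

# Convert to glTF binary (.glb) for standard engines and web viewers.
mesh.export("world_out/background.glb")
```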

Maintenance & Community

The project was released on July 26, 2025, with a technical report. Community channels include WeChat, Xiaohongshu, X (formerly Twitter), and Discord.

Licensing & Compatibility

The repository acknowledges contributions from various open-source projects. Specific licensing terms for HunyuanWorld-1.0 are not stated in the README, so suitability for commercial use or closed-source linking would need to be confirmed with the maintainers.

Limitations & Caveats

The README mentions that certain scenes may fail to load in the ModelViewer due to hardware limitations. A TensorRT version and RGBD video diffusion are listed as future open-source plans, indicating these features are not yet available.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 13

Star History

1,459 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Patrick von Platen (core contributor to Hugging Face Transformers and Diffusers), and 7 more.

stable-dreamfusion by ashawkey

Text-to-3D model using NeRF and diffusion

Top 0.1% · 9k stars · created 2 years ago · updated 1 year ago