HunyuanWorld-1.0 by Tencent-Hunyuan

Generate immersive 3D worlds from text or pixels

Created 5 months ago
2,599 stars

Top 17.9% on SourcePulse

Project Summary

HunyuanWorld-1.0 generates immersive, explorable, and interactive 3D worlds from text or pixel inputs. It targets developers in virtual reality, game development, and interactive content creation, offering a novel approach to 3D scene generation that combines panoramic proxies with mesh export and disentangled object representations.

How It Works

The framework utilizes a semantically layered 3D mesh representation, leveraging panoramic images as 360° world proxies. This approach facilitates semantic-aware decomposition and reconstruction, enabling the generation of diverse 3D worlds. Key advantages include 360° immersive experiences, mesh export for compatibility with existing graphics pipelines, and disentangled object representations for enhanced interactivity.
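As a concrete illustration of the "panorama as 360° world proxy" idea, the sketch below back-projects an equirectangular panorama onto the unit sphere, producing the kind of colored point set that a layering and meshing stage could then decompose and triangulate. This is not the repository's code, only the standard equirectangular geometry, and the function name is made up for the example.

```python
# Minimal sketch of the 360-degree "world proxy" geometry: back-project
# each pixel of an equirectangular panorama onto the unit sphere.
# Not HunyuanWorld code -- just the standard equirectangular mapping.
import numpy as np

def panorama_to_sphere_points(pano: np.ndarray):
    """pano: (H, W, 3) uint8 image -> (N, 3) unit-sphere points, (N, 3) RGB colors."""
    h, w, _ = pano.shape
    v, u = np.mgrid[0:h, 0:w]                  # pixel row/column grids
    lon = (u / w) * 2.0 * np.pi - np.pi        # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v / h) * np.pi        # latitude in [-pi/2, pi/2]
    points = np.stack(
        [np.cos(lat) * np.cos(lon),            # x
         np.sin(lat),                          # y (up)
         np.cos(lat) * np.sin(lon)],           # z
        axis=-1,
    ).reshape(-1, 3)
    colors = pano.reshape(-1, 3) / 255.0
    return points, colors

# Exercise the math on a random 512x1024 "panorama".
pts, cols = panorama_to_sphere_points(
    np.random.randint(0, 256, (512, 1024, 3), dtype=np.uint8))
print(pts.shape, cols.shape)  # (524288, 3) (524288, 3)
```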

Quick Start & Requirements

  • Installation: Requires Python 3.10 and PyTorch 2.5.0+cu124. Installation involves cloning the repository, setting up a Conda environment (docker/HunyuanWorld.yaml), installing dependencies including Real-ESRGAN and ZIM, and logging into Hugging Face. Draco installation is also recommended for exporting to Draco format.
  • Prerequisites: CUDA 12.4 is specified; a quick environment sanity check is sketched after this list.
  • Resources: Model checkpoints are available for download (e.g., HunyuanWorld-PanoDiT-Text: 478MB).
  • Documentation: Examples and a technical report are provided. A modelviewer.html is included for local visualization.
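As referenced in the prerequisites above, a quick sanity check that the interpreter and PyTorch/CUDA build match what the README expects (plain PyTorch calls, nothing project-specific):

```python
# Verify Python 3.10, PyTorch 2.5.0+cu124, and CUDA 12.4 are in place.
import sys
import torch

print("Python :", sys.version.split()[0])    # expect 3.10.x
print("PyTorch:", torch.__version__)         # expect 2.5.0+cu124
print("CUDA   :", torch.version.cuda)        # expect 12.4
print("GPU OK :", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device :", torch.cuda.get_device_name(0))
```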

Highlighted Details

  • Achieves state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds, outperforming baselines in visual quality and geometric consistency across various benchmarks.
  • Supports both text-to-panorama and image-to-panorama generation, followed by scene generation from panoramas.
  • Offers mesh export capabilities and disentangled object representations for interactivity (a mesh-inspection sketch follows this list).
  • The open-source version is based on Flux but adaptable to other models like Stable Diffusion.
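Since exported scenes are ordinary meshes, they can be inspected or converted with standard geometry tooling. Below is a small sketch using the third-party trimesh library; the file path is a placeholder, not a path the repository guarantees:

```python
# Inspect an exported scene mesh with trimesh (pip install trimesh).
# "output/scene.ply" is a placeholder for wherever your run wrote its mesh.
import trimesh

mesh = trimesh.load("output/scene.ply")
print("vertices  :", len(mesh.vertices))
print("faces     :", len(mesh.faces))
print("watertight:", mesh.is_watertight)

# Re-export to glTF binary for game engines or web viewers.
mesh.export("output/scene.glb")
```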

Maintenance & Community

The project was released on July 26, 2025, with a technical report. Community channels include WeChat, Xiaohongshu, X (formerly Twitter), and Discord.

Licensing & Compatibility

The repository acknowledges contributions from various open-source projects. The README does not explicitly state the license for HunyuanWorld-1.0, so suitability for commercial use or closed-source linking would need clarification before adoption.

Limitations & Caveats

The README mentions that certain scenes may fail to load in the ModelViewer due to hardware limitations. A TensorRT version and RGBD video diffusion are listed as future open-source plans, indicating these features are not yet available.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 95 stars in the last 30 days

Explore Similar Projects

Starred by Yaowei Zheng (author of LLaMA-Factory), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 13 more.

stable-dreamfusion by ashawkey

Text-to-3D model using NeRF and diffusion

9k stars · Top 0.1% on SourcePulse
Created 3 years ago, updated 2 years ago