HunyuanWorld-1.0 by Tencent-Hunyuan

Generate immersive 3D worlds from text or pixels

Created 2 months ago
2,166 stars

Top 20.9% on SourcePulse

Project Summary

HunyuanWorld-1.0 generates immersive, explorable, and interactive 3D worlds from text or pixel inputs. It targets developers in virtual reality, game development, and interactive content creation, offering a novel approach to 3D scene generation that combines panoramic proxies with mesh export and disentangled object representations.

How It Works

The framework uses a semantically layered 3D mesh representation, with panoramic images serving as 360° world proxies. This supports semantic-aware decomposition and reconstruction of a scene, which in turn allows diverse 3D worlds to be generated. Key advantages are 360° immersive experiences, mesh export for compatibility with existing graphics pipelines, and disentangled object representations for enhanced interactivity.
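To make the idea of a panorama as a 360° proxy concrete, the sketch below maps a panorama pixel to a viewing direction on the unit sphere. It assumes an equirectangular layout and a particular axis convention (+y up, image center looking along +z); neither is stated in the summary above, and the function name `pano_pixel_to_ray` is hypothetical rather than part of the project's code.

```python
import math


def pano_pixel_to_ray(u, v, width, height):
    """Map equirectangular pixel coordinates to a unit view direction.

    Assumed convention (not taken from the project): u grows to the right,
    v grows downward, +y is up, and the image center looks along +z.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi  # longitude in [-pi, pi]
    lat = (0.5 - v / height) * math.pi       # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Under these assumptions, the center pixel of a 2048x1024 panorama, `pano_pixel_to_ray(1024, 512, 2048, 1024)`, points straight ahead along +z, and every pixel maps to a direction of unit length, so the whole image covers the full sphere around the viewer.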

Quick Start & Requirements

  • Installation: Requires Python 3.10 and PyTorch 2.5.0+cu124. Installation involves cloning the repository, setting up a Conda environment (docker/HunyuanWorld.yaml), installing dependencies including Real-ESRGAN and ZIM, and logging into Hugging Face. Draco installation is also recommended for exporting to Draco format.
  • Prerequisites: CUDA 12.4 is specified.
  • Resources: Model checkpoints are available for download (e.g., HunyuanWorld-PanoDiT-Text: 478MB).
  • Documentation: Examples and a technical report are provided. A modelviewer.html is included for local visualization.
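As a rough aid for the version requirements listed above, the following sketch compares an environment against the quoted versions (Python 3.10, PyTorch 2.5.0+cu124, CUDA 12.4). The function name `check_environment` and the exact-match policy are assumptions made for illustration; the project itself may tolerate other versions.

```python
# Versions quoted in the requirements above.
REQUIRED_PYTHON = (3, 10)
REQUIRED_TORCH = "2.5.0+cu124"
REQUIRED_CUDA = "12.4"


def check_environment(python_version, torch_version, cuda_version):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        found = f"{python_version[0]}.{python_version[1]}"
        problems.append(f"Python {found} found, 3.10 expected")
    if torch_version != REQUIRED_TORCH:
        problems.append(f"PyTorch {torch_version} found, {REQUIRED_TORCH} expected")
    if cuda_version != REQUIRED_CUDA:
        problems.append(f"CUDA {cuda_version} found, {REQUIRED_CUDA} expected")
    return problems
```

Inside the activated Conda environment, the function could be called with `sys.version_info`, `torch.__version__`, and `torch.version.cuda`; an empty list means the environment matches the stated requirements.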

Highlighted Details

  • Reports state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds, outperforming baselines in visual quality and geometric consistency across several benchmarks.
  • Supports both text-to-panorama and image-to-panorama generation, followed by scene generation from panoramas.
  • Offers mesh export capabilities and disentangled object representations for interactivity.
  • The open-source version is built on Flux, but the method can be adapted to other image generation models such as Stable Diffusion.
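The two-stage flow described above (a panorama is produced first, from either text or an image, and a scene is then generated from that panorama) can be sketched as a small planning helper. The function and stage names here are hypothetical labels for illustration only and do not correspond to the repository's scripts or API.

```python
def plan_stages(text_prompt=None, image_path=None):
    """Return the ordered stage names for one input (hypothetical labels)."""
    # Exactly one kind of input is expected: text or an image.
    if (text_prompt is None) == (image_path is None):
        raise ValueError("provide exactly one of text_prompt or image_path")
    first = "text-to-panorama" if text_prompt is not None else "image-to-panorama"
    # Scene generation always starts from the panorama made in the first stage.
    return [first, "panorama-to-scene"]
```

The point of the sketch is that both input types converge on the same second stage, so only the panorama step differs between the text and image paths.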

Maintenance & Community

The project was released on July 26, 2025, with a technical report. Community channels include WeChat, Xiaohongshu, X (formerly Twitter), and Discord.

Licensing & Compatibility

The repository acknowledges contributions from various open-source projects. Specific licensing for HunyuanWorld-1.0 is not explicitly stated in the README, so compatibility with commercial use or closed-source linking would require clarification.

Limitations & Caveats

The README mentions that certain scenes may fail to load in the ModelViewer due to hardware limitations. A TensorRT version and RGBD video diffusion are listed as future open-source plans, indicating these features are not yet available.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 13

Star History

261 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (Cofounder of Lightning AI), and 6 more.

threestudio by threestudio-project

0.2%
7k
Framework for 3D content generation from text/images using 2D diffusion
Created 2 years ago
Updated 9 months ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 13 more.

stable-dreamfusion by ashawkey

0.1%
9k
Text-to-3D model using NeRF and diffusion
Created 2 years ago
Updated 1 year ago