HEX by Open-X-Humanoid

Humanoid robots executing skilled manipulation

Created 3 weeks ago

490 stars

Top 62.5% on SourcePulse

Project Summary

Summary

HEX is a whole-body vision-language-action (VLA) framework for full-size humanoid robots that targets cross-embodiment whole-body manipulation. It addresses two challenges, transferring learned policies across different humanoid platforms and executing long-horizon tasks, by aligning heterogeneous robot states and learning predictive body dynamics.

How It Works

HEX pairs a Qwen-VL backbone for perception and language understanding with a flow-matching action head. Its core idea is to align the states of different humanoids into shared body-part representations, so that predictive body dynamics can be learned from cross-embodiment data and the resulting policy generalizes across embodiments. During deployment, HEX predicts high-level actions for the arms, hands, and waist, while a separate low-level controller handles leg movements to keep the robot stable during manipulation.
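To make the idea of shared body-part representations concrete, here is a minimal sketch of one way heterogeneous states could be aligned. It is not taken from the HEX codebase: the slot layout `SHARED_SLOTS`, the function `align_state`, the joint maps, and the zero-padding-plus-mask scheme are all illustrative assumptions.

```python
# Hypothetical shared layout: body part -> fixed slot width.
# The parts and widths are assumptions, not values from HEX.
SHARED_SLOTS = {
    "left_arm": 7,
    "right_arm": 7,
    "left_hand": 6,
    "right_hand": 6,
    "waist": 3,
}


def align_state(raw_state, joint_map):
    """Map a robot-specific state vector into fixed-width body-part slots.

    raw_state: flat list of joint values for one robot.
    joint_map: dict of body part -> indices into raw_state (hypothetical).
    Returns (values, mask): each slot is zero-padded to its shared width,
    and mask marks which entries hold real joints (1) or padding (0).
    """
    values, mask = {}, {}
    for part, width in SHARED_SLOTS.items():
        indices = joint_map.get(part, [])  # a missing part becomes all padding
        if len(indices) > width:
            raise ValueError(f"{part} has more joints than its shared slot")
        joints = [raw_state[i] for i in indices]
        padding = width - len(joints)
        values[part] = joints + [0.0] * padding
        mask[part] = [1] * len(joints) + [0] * padding
    return values, mask


# Two made-up embodiments: A has 7-DoF arms and a 3-DoF waist,
# B has 5-DoF arms and a 1-DoF waist. Neither reports hand joints.
state_a = [0.1 * i for i in range(17)]
map_a = {
    "left_arm": list(range(0, 7)),
    "right_arm": list(range(7, 14)),
    "waist": list(range(14, 17)),
}
state_b = [1.0] * 11
map_b = {
    "left_arm": list(range(0, 5)),
    "right_arm": list(range(5, 10)),
    "waist": [10],
}

values_a, mask_a = align_state(state_a, map_a)
values_b, mask_b = align_state(state_b, map_b)
```

After alignment both robots expose the same keys and slot widths, so a single model could consume either one; the mask lets it ignore padded entries. How HEX actually encodes body parts may differ.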

Quick Start & Requirements

  • Installation: Clone the repository, create a Conda environment (python=3.10), install the system libraries (libegl1-mesa-dev, libglu1-mesa), install the Python requirements (pip install -r requirements.txt), install FlashAttention2 (a manually downloaded wheel may be needed), then install HEX in editable mode (pip install -e .).
  • Prerequisites: Python 3.10, PyTorch, CUDA (the FlashAttention2 wheel must match the installed PyTorch and CUDA versions), libegl1-mesa-dev, libglu1-mesa.
  • Models: Requires separate download of pretrained HEX checkpoints and the Qwen3-VL base model.
  • Links: HEX Checkpoint: 🤗 HEX-model. Evaluation notebook: notebooks/eval_model.ipynb.
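Before installing HEX itself, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the HEX repository: the function name `check_environment` is hypothetical, and it assumes the usual import names `torch` for PyTorch and `flash_attn` for FlashAttention2.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 10)          # Python version named in the requirements
MODULES = ("torch", "flash_attn")  # assumed import names for the prerequisites


def check_environment():
    """Report whether the interpreter and key packages look usable.

    Only checks that each module can be located; it does not import it,
    so a wheel built for the wrong CUDA version would still pass here.
    """
    report = {"python_ok": sys.version_info[:2] == REQUIRED_PYTHON}
    for name in MODULES:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing or mismatched'}")
```

A passing report only shows that the packages are discoverable. Whether the FlashAttention2 wheel matches the installed PyTorch and CUDA versions still has to be verified by importing it on the target machine.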

Highlighted Details

  • Enables cross-embodiment transfer and long-horizon whole-body manipulation for humanoid robots.
  • Open-sources 8 real-world evaluation task datasets and training data from multiple humanoid platforms.
  • Provides pretraining and fine-tuning code for the VLA framework.

Maintenance & Community

The provided README does not detail specific maintenance schedules, notable contributors, sponsorships, or community channels (e.g., Discord, Slack).

Licensing & Compatibility

The core software license is not explicitly stated. The README does note commercial restrictions on the data collection pipeline for Tienkung series robots, and that the associated low-level whole-body controller is not open-sourced. Commercial use or integration may therefore be limited until the licensing terms are clarified.

Limitations & Caveats

The data collection pipeline and low-level controller for Tienkung robots are proprietary. Installing FlashAttention2 may require manually selecting a wheel that matches the installed PyTorch and CUDA versions. The framework also requires downloading the external Qwen3-VL base model.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 497 stars in the last 24 days
