physical-superintelligence-lab
Humanoid loco-manipulation foundation model
Top 28.0% on SourcePulse
Psi-Zero: A Foundation Model for Humanoid Loco-Manipulation
Psi-Zero is an open foundation model for dexterous humanoid loco-manipulation, aimed at advancing universal humanoid intelligence. It addresses the challenge of rapidly acquiring new, complex manipulation skills by supporting fine-tuning with minimal real-world data. The project targets robotics and AI researchers and engineers, offering a base model intended to reduce the data and time needed to teach robots new tasks.
How It Works
The Psi-Zero architecture comprises two primary end-to-end trained components: a vision-language backbone (System-2) and a multimodal diffusion transformer action expert (System-1). The backbone, based on Qwen3-VL-2B-Instruct, extracts features from observations and instructions. These features condition a flow-based multimodal diffusion transformer, inspired by Stable Diffusion 3, which predicts future whole-body action chunks. At the lowest level (System-0), an RL-based tracking controller ensures precise physical execution of the predicted actions. This approach allows the model to learn task semantics and visual representations from large-scale egocentric videos, then adapt to real-world embodiment dynamics through post-training on limited teleoperated robot data.
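As a rough illustration of how a flow-based action expert can turn noise into an action chunk, the sketch below integrates a velocity field from t=0 to t=1 with Euler steps. It is a toy under stated assumptions, not Psi-Zero's implementation: toy_velocity is a hypothetical stand-in for the learned transformer, and the target list stands in for whatever the network would predict when conditioned on backbone features. The summary above does not specify the sampler or step count.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned action expert.

    Returns the velocity of a straight-line (rectified-flow style) path
    from the current point x toward `target`. In a real model this would
    be a network conditioned on vision-language features.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action_chunk(target, num_steps=10, seed=0):
    """Integrate the velocity field from noise (t=0) to an action chunk (t=1)."""
    rng = random.Random(seed)
    # Start from Gaussian noise with the same length as the flattened chunk.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the division is safe
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

With this toy velocity field, the final Euler step lands on the target up to floating-point error, so sample_action_chunk([0.1, -0.2, 0.3]) returns values very close to the input list. A learned velocity field would only approximate such a path, which is why the number of integration steps matters in practice.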
Quick Start & Requirements
Installation involves cloning the repository and managing Python dependencies with uv. Create and activate a virtual environment (uv venv .venv-psi, then source .venv-psi/bin/activate) and install the dependencies with uv sync --all-groups. The setup pins flash_attn==2.7.4.post1, and the environment uses Python 3.10. Pre-trained models and datasets are available on Hugging Face.
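After syncing, it can help to confirm that the environment matches the stated requirements. The following is a minimal sketch using only the Python standard library; check_environment and installed_version are hypothetical helper names, not part of the Psi-Zero repository, and the lookup assumes the package is registered under the distribution name flash_attn (or an equivalent normalized form).

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(required_python=(3, 10), pins=None):
    """Report whether the interpreter and pinned packages match expectations."""
    if pins is None:
        # Pin taken from the requirements stated above.
        pins = {"flash_attn": "2.7.4.post1"}
    report = {"python_ok": sys.version_info[:2] == tuple(required_python)}
    for name, wanted in pins.items():
        # True only when the package is installed at exactly the pinned version.
        report[name] = installed_version(name) == wanted
    return report
```

Calling check_environment() inside the activated .venv-psi environment returns a dictionary of booleans; any False entry points at a mismatch worth resolving before running the model.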
Maintenance & Community
The project lists several contributors. No explicit links to community channels (e.g., Discord, Slack) or a public roadmap were found in the provided README.
Licensing & Compatibility
This project is licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and linking in closed-source projects.
Limitations & Caveats
Installation of the SIMPLE humanoid benchmarking simulator is marked as "Coming soon." Motion-planning-based data generation and teleoperation within the simulator are also pending. The troubleshooting section notes potential issues with specific dependencies (the lerobot stack, evdev, and wandb) and GPU memory considerations on newer hardware.