wall-x by X-Square-Robot

Robotics foundation models for embodied intelligence

Created 9 months ago
1,079 stars

Top 34.8% on SourcePulse

Project Summary

This project provides the training and inference code for the WALL series of embodied foundation models, aiming to build general-purpose robots. It addresses the challenge of creating truly generalizable AI by developing models that learn from continuous physical interaction, enabling robots to understand and act effectively in the world. The repository is targeted at researchers and engineers in robotics and AI who wish to leverage or build upon advanced embodied foundation models.

How It Works

The WALL series models follow an embodied foundation model approach. The repository provides an end-to-end pipeline covering data preparation (via LeRobot), model configuration, and training with flow-matching and FAST action branches. A key design principle is a direct feedback loop between the model's decisions and the robot's physical experience, which is intended to foster embodiment-aware vision-language understanding and robust manipulation capabilities.

Quick Start & Requirements

Setup starts with creating and activating a conda environment (python=3.10), then installing dependencies with pip install -r requirements.txt followed by pip install flash-attn==2.7.4.post1. The lerobot library must be cloned, checked out at a specific commit (c66cd401767e60baece16e1cf68da2824227e076), and installed in editable mode (pip install -e .). Finally, wall_x is installed by initializing submodules (git submodule update --init --recursive) and running pip install --no-build-isolation --verbose -e . from the repository root. Prerequisites are Python 3.10 and that pinned lerobot commit.
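The setup steps can be collected into a single script. The following is a minimal sketch, not the project's official installer: the environment name (wallx), the lerobot clone URL, the ../lerobot clone location, and the run helper are assumptions made for illustration, and the script assumes it is started from the root of a wall-x checkout. By default it only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch of the setup steps from the summary. Prints each command by default;
# set DRY_RUN=0 to actually execute them.
set -e

DRY_RUN="${DRY_RUN:-1}"
ENV_NAME="wallx"                                           # assumption: any name works
LEROBOT_URL="https://github.com/huggingface/lerobot.git"   # assumption: upstream lerobot repo
LEROBOT_DIR="../lerobot"                                   # assumption: sibling directory
LEROBOT_COMMIT="c66cd401767e60baece16e1cf68da2824227e076"  # commit named in the summary

# Hypothetical helper: echo the command, and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# "conda activate" needs the shell hook when used inside a script.
if [ "$DRY_RUN" = "0" ]; then eval "$(conda shell.bash hook)"; fi

# 1. Conda environment with Python 3.10
run conda create -n "$ENV_NAME" python=3.10 -y
run conda activate "$ENV_NAME"

# 2. Python dependencies, then the pinned flash-attn build
run pip install -r requirements.txt
run pip install flash-attn==2.7.4.post1

# 3. lerobot at the pinned commit, installed in editable mode
run git clone "$LEROBOT_URL" "$LEROBOT_DIR"
run git -C "$LEROBOT_DIR" checkout "$LEROBOT_COMMIT"
run pip install -e "$LEROBOT_DIR"

# 4. wall_x itself, after initializing submodules
run git submodule update --init --recursive
run pip install --no-build-isolation --verbose -e .
```

Running the script as-is lists the nine commands in order; running it with DRY_RUN=0 executes them. Pinning the lerobot checkout by commit hash rather than by branch keeps the install reproducible even if upstream lerobot changes.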

Highlighted Details

  • Introduces WALL-OSS-0.5, a deployment-ready Vision-Language-Action (VLA) model with gradient-bridged pretraining, offering zero-shot real-robot manipulation.
  • Features WALL-OSS, an end-to-end embodied foundation model designed for embodiment-aware vision-language understanding, strong language-action association, and robust manipulation.
  • Provides multiple pre-trained models available on Hugging Face, including WALL-OSS-0.5, WALL-OSS-FLOW, and WALL-OSS-FAST.
  • Includes scripts for basic action inference, open-loop evaluation, VQA inference, and Chain-of-Thought (CoT) testing for embodied tasks.

Maintenance & Community

A community discussion group is available via a WeChat QR code. The project's foundational paper lists numerous authors, suggesting a broad research effort.

Licensing & Compatibility

The provided README does not explicitly state the software license. This omission requires clarification regarding usage rights, particularly for commercial applications or integration into closed-source systems.

Limitations & Caveats

Code for the recently announced WALL-OSS-0.5 model is marked "coming soon," so the release is currently incomplete for that model. Installation requires a specific pinned commit of the lerobot dependency, which suggests the code may not be compatible with newer lerobot releases. The absence of explicit licensing information is a significant adoption blocker.

Health Check

Last Commit: 2 weeks ago
Responsiveness: Inactive
Pull Requests (30d): 6
Issues (30d): 5
Star History: 225 stars in the last 30 days
