sylvestf
Benchmark for analyzing Vision-Language-Action model robustness
This repository provides LIBERO-plus, a generalized benchmark designed for in-depth robustness analysis of Vision-Language-Action (VLA) models. It systematically exposes vulnerabilities in contemporary VLA models through comprehensive evaluations across seven perturbation dimensions, enabling researchers and engineers to rigorously assess and improve model resilience. The primary benefit is a standardized, detailed methodology for understanding VLA model weaknesses in realistic, varied conditions.
How It Works
LIBERO-plus introduces a benchmark suite of 10,030 tasks that evaluates VLA models against seven distinct perturbation categories: Objects Layout, Camera Viewpoints, Robot Initial States, Language Instructions, Light Conditions, Background Textures, and Sensor Noise. Breaking results down by category makes specific failure modes visible, such as extreme sensitivity to environmental changes or a lack of genuine language understanding, and gives deeper insight into model robustness than standard benchmarks.
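The per-category breakdown described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the repository's evaluation code: the `(dimension, succeeded)` record format, the function names, and the sample outcomes are all assumptions made up for this example.

```python
from collections import defaultdict

# The seven perturbation dimensions named in the benchmark description.
DIMENSIONS = [
    "Objects Layout",
    "Camera Viewpoints",
    "Robot Initial States",
    "Language Instructions",
    "Light Conditions",
    "Background Textures",
    "Sensor Noise",
]


def success_rate_by_dimension(results):
    """Return {dimension: success rate} from (dimension, succeeded) pairs.

    `results` is a hypothetical record format, not actual LIBERO-plus output.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for dimension, succeeded in results:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown perturbation dimension: {dimension}")
        totals[dimension] += 1
        wins[dimension] += int(succeeded)
    # Only report dimensions that actually have episodes.
    return {d: wins[d] / totals[d] for d in DIMENSIONS if totals[d]}


def weakest_dimension(rates):
    """Return the dimension with the lowest success rate."""
    return min(rates, key=rates.get)


# Made-up outcomes, purely for illustration.
example = [
    ("Camera Viewpoints", False),
    ("Camera Viewpoints", False),
    ("Camera Viewpoints", True),
    ("Light Conditions", True),
    ("Light Conditions", True),
    ("Sensor Noise", True),
    ("Sensor Noise", False),
]
rates = success_rate_by_dimension(example)
print(rates)
print(weakest_dimension(rates))
```

Reporting one rate per dimension, rather than a single aggregate score, is what lets a weakness in one category stand out instead of being averaged away.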
Quick Start & Requirements
Install command: pip install -e .
Additional dependencies: apt install libexpat1, apt install libfontconfig1-dev, apt install libpython3-stdlib, apt-get install libmagickwand-dev, and pip install -r extra_requirements.txt. Users must also download and unzip the project's assets to the /LIBERO-plus/libero/libero/assets/ directory.
Highlighted Details
Maintenance & Community
The project aims to be community-driven and encourages users to submit pull requests that add research adopting LIBERO-plus. The README does not mention community channels such as Discord or Slack.
Licensing & Compatibility
The provided README does not explicitly state the software license for the LIBERO-plus repository. This absence of clear licensing information may pose compatibility concerns for commercial use or integration into closed-source projects.
Limitations & Caveats
The benchmark highlights inherent limitations in current VLA models, such as extreme sensitivity to environmental variations and a tendency to disregard language instructions. The setup process requires careful uninstallation of prior LIBERO versions and manual asset management, which could be a minor adoption hurdle. The lack of explicit licensing is a significant caveat for potential adopters.
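The two setup hurdles mentioned above (a leftover LIBERO install and manually placed assets) can be checked before running anything. The following is a minimal sketch using only the Python standard library. The distribution name "libero" is an assumption, not something the summary confirms, and the helper names are hypothetical; the assets path is the one quoted in the setup notes and may differ on a given machine.

```python
from importlib import metadata
from pathlib import Path


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def assets_ready(assets_dir):
    """True if the assets directory exists and contains at least one entry."""
    path = Path(assets_dir)
    return path.is_dir() and any(path.iterdir())


# "libero" as the distribution name is an assumption; adjust to your install.
print("installed libero version:", installed_version("libero"))
# Path taken from the setup notes; it may differ on your machine.
print("assets ready:", assets_ready("/LIBERO-plus/libero/libero/assets/"))
```

A version reported before installing LIBERO-plus suggests an earlier LIBERO package is still present and should be uninstalled first. The assets check only confirms the directory is non-empty, not that its contents are complete.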