JiuTian-VL
Advancing robotic manipulation with large Vision-Language-Action models
Top 96.0% on SourcePulse
This repository serves as a comprehensive, curated survey and resource hub for large Vision-Language-Action (VLA) models applied to robotic manipulation. It addresses the growing need for robots that can interpret natural language, perceive complex environments, and execute diverse tasks with enhanced generalization, targeting researchers and engineers in AI and robotics. The project offers a structured overview of the rapidly evolving field, consolidating key papers, benchmarks, and resources for easier access and reference.
How It Works
The project systematically categorizes and lists research papers and resources related to large VLM-based VLA models for robotic manipulation. It organizes findings into key architectural paradigms, including monolithic (single and dual-system) and hierarchical models, alongside advanced fields like reinforcement learning, training-free methods, learning from human videos, and world model-based approaches. This structured compilation facilitates a deep understanding of the landscape and the diverse methodologies employed.
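The categorization described above can be pictured as a small data structure. The following is a minimal sketch only: the category names come from the summary, but the `Category`, `Entry`, and `group_by_category` names and the placeholder titles are hypothetical and are not taken from the repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # Architectural paradigms named in the summary
    MONOLITHIC_SINGLE_SYSTEM = "monolithic / single-system"
    MONOLITHIC_DUAL_SYSTEM = "monolithic / dual-system"
    HIERARCHICAL = "hierarchical"
    # Advanced fields named in the summary
    REINFORCEMENT_LEARNING = "reinforcement learning"
    TRAINING_FREE = "training-free methods"
    HUMAN_VIDEOS = "learning from human videos"
    WORLD_MODEL = "world model-based"


@dataclass(frozen=True)
class Entry:
    title: str          # placeholder titles only, not real papers
    category: Category


def group_by_category(entries):
    """Group entry titles under their category, preserving input order."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.category].append(entry.title)
    return dict(grouped)


# Hypothetical entries, used purely to illustrate the grouping
entries = [
    Entry("Placeholder paper A", Category.HIERARCHICAL),
    Entry("Placeholder paper B", Category.WORLD_MODEL),
    Entry("Placeholder paper C", Category.HIERARCHICAL),
]

index = group_by_category(entries)
for category, titles in index.items():
    print(f"{category.value}: {', '.join(titles)}")
```

An enum keeps the category set closed, so a paper cannot be filed under a misspelled or ad hoc heading. The actual repository may organize its lists differently.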
Quick Start & Requirements
This repository is a curated list of research resources, not a deployable software package. The primary resource is the survey paper "Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey", available on arXiv. The repository itself lists no installation or runtime requirements.
Highlighted Details
Maintenance & Community
The project is actively maintained, with a note indicating "We're still cooking — Stay tuned!" and a commitment to continuously update the repository with newly published works. Community engagement is encouraged via GitHub pull requests for contributions. Contact information for the authors is provided for questions and suggestions.
Licensing & Compatibility
The repository is licensed under the MIT License, which generally permits broad use, modification, and distribution, including for commercial purposes, with minimal restrictions.
Limitations & Caveats
As a survey and curated list, this repository does not provide a unified codebase or a direct implementation of VLA models. The "still cooking" status indicates that the list is still being developed and that entries may be added or revised. Because research in this domain moves quickly, any snapshot of the list may lag behind the latest work.