Awesome-VLA-Papers by Psi-Robot

Vision-Language-Action (VLA) research paper compilation

Created 6 months ago
311 stars

Top 86.3% on SourcePulse

Project Summary

This repository is a curated bibliography of research papers on Vision-Language-Action (VLA) models, aimed at researchers and practitioners in embodied AI and robotics. It organizes seminal and recent works into a structured overview of VLA research, its foundational components, and its applications.

How It Works

The collection is organized by approach to integrating vision, language, and action, and primarily by how actions are tokenized and represented. Key sections include "Language Description as Action Tokens," "Code as Action Tokens," "Affordance as Action Tokens," and "Reasoning as Action Tokens." Further sections cover foundational language and vision models, specific VLA architectures, and related survey papers. Each entry links directly to the publication, code repository, pre-trained models, and official website.
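The organization described above can be pictured as a small data model. The following is a minimal sketch only: the `PaperEntry` class, its field names, and the `group_by_category` helper are hypothetical, the repository itself is a README rather than structured data, and the titles and URLs are placeholders rather than real entries.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PaperEntry:
    """One bibliography entry (hypothetical schema, not the repository's own)."""
    title: str
    category: str  # e.g. "Code as Action Tokens"
    links: Dict[str, str] = field(default_factory=dict)  # link kind -> URL


def group_by_category(entries: List[PaperEntry]) -> Dict[str, List[str]]:
    """Group entry titles under their category, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        grouped[entry.category].append(entry.title)
    return dict(grouped)


# Placeholder entries: titles and URLs are invented for illustration only.
entries = [
    PaperEntry("Placeholder Paper A", "Code as Action Tokens",
               {"paper": "https://example.org/a", "code": "https://example.org/a-code"}),
    PaperEntry("Placeholder Paper B", "Affordance as Action Tokens",
               {"paper": "https://example.org/b"}),
    PaperEntry("Placeholder Paper C", "Code as Action Tokens"),
]

index = group_by_category(entries)
for category, titles in index.items():
    print(f"{category}: {', '.join(titles)}")
```

Keying the index by action-token category mirrors the primary axis of the collection; the per-entry `links` mapping stands in for the paper, code, model, and website links attached to each listing.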

Highlighted Details

  • Action Tokenization Paradigms: Papers are categorized by how actions are tokenized and integrated with vision and language, covering paradigms such as language descriptions, code, affordances, keypoints, and reasoning.
  • Broad Model Coverage: Encompasses a wide array of foundational models (e.g., Transformers, ViT, CLIP) and specific VLA architectures, spanning applications in robotics, autonomous driving, and generalist agents.
  • Rich Metadata and Links: Each listed paper includes direct links to its publication, associated code repositories, pre-trained models, datasets, and official websites, facilitating efficient access to research artifacts.
  • Survey and Dataset Compilations: Features dedicated sections for related survey papers and relevant datasets, providing broader context and resources for understanding the VLA research landscape and its data requirements.

Maintenance & Community

The README does not specify maintenance details or community channels for this repository itself. The individual research papers listed may have their own associated communities and development efforts.

Licensing & Compatibility

No licensing information is provided for this repository or the curated collection of research papers.

Limitations & Caveats

This repository is purely an informational resource: a curated bibliography of research papers. It hosts no executable code, models, or implementation tools of its own, and only links to artifacts published elsewhere.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 40 stars in the last 30 days
