Psi-Robot: Vision-Language-Action (VLA) research paper compilation
Top 86.3% on SourcePulse
This repository is a comprehensive, curated bibliography of research papers on Vision-Language-Action (VLA) models, intended for researchers and practitioners in embodied AI and robotics. It systematically organizes seminal and recent works, offering a structured overview of advances in VLA research, its foundational components, and its diverse applications.
How It Works
The collection is organized by distinct approaches to integrating the vision, language, and action modalities, categorized primarily by how actions are tokenized and represented. Key sections include "Language Description as Action Tokens," "Code as Action Tokens," "Affordance as Action Tokens," and "Reasoning as Action Tokens." It also covers foundational language and vision models, specific VLA architectures, and related survey papers, with each entry linking directly to the publication and, where available, code repositories, pre-trained models, and official project websites.
Maintenance & Community
The README does not specify maintenance details or community channels for this repository itself. The individual research papers listed may have their own associated communities and development efforts.
Licensing & Compatibility
No licensing information is provided for this repository or the curated collection of research papers.
Limitations & Caveats
This repository is purely an informational resource: a curated bibliography of research papers. It provides no executable code, models, or implementation tools, serving solely as a reference guide to the VLA research landscape.