Curated list of Vision-Language-Action models for autonomous driving
Top 82.0% on SourcePulse
This repository serves as a curated list of Vision-Language-Action (VLA) models for Autonomous Driving (AD), complementing a survey paper on the topic. It targets researchers and developers in autonomous driving and multimodal AI, providing a structured overview of the evolving VLA4AD landscape, from explanatory perception to end-to-end control.
How It Works
The repository categorizes VLA4AD models into four paradigms: VLM as Explainers, Modular VLA4AD, End-to-End VLA4AD, and Reasoning-Augmented VLA4AD. It details the progression from simple language explanations to complex systems that integrate vision, language, and action for instruction understanding, reasoning, and vehicle control, often leveraging large language models (LLMs) and diffusion models.
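The four paradigms differ mainly in where language sits in the control loop: as a post-hoc explainer, as one module among several, or inside a single model that emits control commands directly. As a rough illustration only, the Python sketch below shows what an end-to-end VLA interface might look like; all names (EndToEndVLA, vision_encoder, action_head, etc.) are hypothetical placeholders and do not correspond to code in the repository or to any specific listed model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class DrivingAction:
    steer: float     # normalized steering, -1 (full left) to 1 (full right)
    throttle: float  # normalized throttle, 0 to 1
    brake: float     # normalized brake, 0 to 1

class EndToEndVLA:
    """Sketch of the end-to-end paradigm: one model maps camera frames plus a
    natural-language instruction directly to a control command. In the modular
    paradigm, by contrast, the language output would feed a separate planner
    instead of an action head."""

    def __init__(self,
                 vision_encoder: Callable[[Sequence[object]], list],
                 language_model: Callable[[list, str], list],
                 action_head: Callable[[list], tuple]):
        self.vision_encoder = vision_encoder  # e.g. a ViT-style image encoder
        self.language_model = language_model  # e.g. an LLM backbone fusing vision and text
        self.action_head = action_head        # decodes fused features into control values

    def act(self, camera_frames: Sequence[object], instruction: str) -> DrivingAction:
        visual_tokens = self.vision_encoder(camera_frames)       # images -> visual tokens
        fused = self.language_model(visual_tokens, instruction)  # fuse vision + instruction
        steer, throttle, brake = self.action_head(fused)         # predict controls
        return DrivingAction(steer, throttle, brake)

# Toy usage with stub components, just to show the data flow.
if __name__ == "__main__":
    model = EndToEndVLA(
        vision_encoder=lambda frames: [0.0] * 8,
        language_model=lambda tokens, text: tokens + [float(len(text))],
        action_head=lambda fused: (0.1, 0.5, 0.0),
    )
    print(model.act(camera_frames=["front_cam_frame"],
                    instruction="turn left at the intersection"))
```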
Quick Start & Requirements
git clone https://github.com/JohnsonJiang1996/Awesome-VLA4AD.git
cd Awesome-VLA4AD
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
This repository is a curated list of resources and does not contain executable code for VLA4AD models. Users must individually locate, install, and configure the specific models and datasets they wish to use, each with its own set of dependencies and requirements.
Last updated: 2 months ago. Activity status: Inactive.