Survey paper on multimodal LLMs for autonomous driving
This repository provides a comprehensive survey of Multimodal Large Language Models (MLLMs) applied to autonomous driving. It serves as a valuable resource for researchers and engineers exploring the integration of LLMs into vehicle perception, planning, and control systems, offering a structured overview of current advancements, datasets, and future research directions.
How It Works
The survey systematically investigates the application of MLLMs in autonomous driving by first introducing the foundational concepts of MLLMs and the history of autonomous driving. It then provides an extensive overview of existing MLLM tools, datasets, and benchmarks relevant to driving, transportation, and map systems. The work also summarizes contributions from the WACV 2024 LLVM-AD workshop, highlighting key challenges and opportunities.
Quick Start & Requirements
This repository hosts a survey; there is no code to install or run. It links to the WACV 2024 proceedings, the arXiv version of the paper, and details of the LLVM-AD workshop.
Maintenance & Community
The repository is actively updated with new references, including recent contributions from CVPR 2024 (MAPLM, LaMPilot). It also notes the successful organization of the LLVM-AD workshop at WACV 2024.
Licensing & Compatibility
The repository does not specify a license. A standard academic citation is provided for citing the survey paper.
Limitations & Caveats
As a survey, this repository provides no executable code or benchmarks. Because MLLMs are evolving rapidly, the survey's coverage may lag behind the latest work and requires ongoing updates to stay current.