OmniDrive: LLM-agent framework for end-to-end autonomous driving
OmniDrive is a framework for end-to-end autonomous driving built around a 3D multimodal Large Language Model (LLM) agent. It targets researchers and developers in autonomous driving and covers perception, reasoning, and planning, with a focus on interactive conversation and counterfactual analysis of driving scenarios.
How It Works
The core innovation is the OmniDrive-Agent, a 3D multimodal LLM that uses sparse queries to lift and compress visual representations into 3D space. Because the LLM then operates on compact 3D scene information, it can handle tasks such as scene description, traffic regulation adherence, 3D object grounding, and counterfactual reasoning about driving decisions.
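To make the sparse-query idea concrete, here is a minimal NumPy sketch of a small set of queries cross-attending over multi-view image features and returning a fixed number of scene tokens. It is an illustration under stated assumptions, not OmniDrive's implementation: the function name `compress_with_sparse_queries`, the linear 3D positional term `w_pos`, and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compress_with_sparse_queries(image_feats, queries, ref_points, w_pos):
    """Single-head cross-attention from sparse queries to image tokens.

    image_feats: (views, tokens, d) multi-view image features
    queries:     (n, d) query embeddings (learned in a real model)
    ref_points:  (n, 3) hypothetical 3D reference point per query
    w_pos:       (3, d) hypothetical linear 3D positional projection
    """
    d = queries.shape[-1]
    keys = image_feats.reshape(-1, d)           # flatten all camera views
    q = queries + ref_points @ w_pos            # inject 3D position into queries
    attn = softmax(q @ keys.T / np.sqrt(d))     # (n, views * tokens)
    return attn @ keys, attn                    # (n, d) compressed scene tokens


rng = np.random.default_rng(0)
image_feats = rng.normal(size=(6, 100, 32))     # 6 cameras, 100 tokens each
queries = rng.normal(size=(16, 32))             # 16 sparse queries
ref_points = rng.uniform(-1.0, 1.0, size=(16, 3))
w_pos = rng.normal(size=(3, 32)) * 0.1

scene_tokens, attn = compress_with_sparse_queries(
    image_feats, queries, ref_points, w_pos
)
print(scene_tokens.shape)  # (16, 32)
```

In this sketch, 600 image tokens are reduced to 16 tokens, which is the kind of compact input an LLM could consume. A real system would use learned projections, multiple attention heads, and camera geometry rather than random values.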
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project is associated with NVlabs and has been accepted to CVPR 2025. TensorRT support was recently added with assistance from the NVIDIA TSE Team.
Licensing & Compatibility
The license is not explicitly stated in the provided README snippet.
Limitations & Caveats
The README does not detail specific limitations, unsupported platforms, or known bugs. The project appears to be in active development, with recent additions like TensorRT support.