Suite of projects for autonomous driving foundation models
This repository collects foundation models and datasets for autonomous driving, aimed at researchers and engineers in the field. It provides generalized predictive models, large-scale video datasets, and benchmarks for language-driven driving, with the goal of accelerating the integration of large foundation models into autonomous driving.
How It Works
The project builds on large-scale datasets such as OpenDV-YouTube (1700+ hours of video) and introduces two models: GenAD for generalized prediction and Vista for world modeling. Together, these components are designed to enable high-fidelity, long-horizon future prediction and multi-modal action execution, addressing the need for robust and generalizable driving intelligence.
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project is actively updated, with recent releases including a mini subset of OpenDV-YouTube and the Vista model. Related challenges and leaderboards are hosted on opendrivelab.com.
Licensing & Compatibility
The README mentions a "YouTube license" for the OpenDV videos, which implies potential restrictions on commercial use or redistribution. Software licenses for the models and code are not explicitly detailed in the provided text.
Limitations & Caveats
The full OpenDV dataset requires substantial storage (up to 24TB for processed images), so working with it in its entirety is impractical without significant resources. The licensing of the video data may also restrict commercial use or redistribution.
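To put the storage figure in perspective, the following is a rough back-of-the-envelope sketch in Python. It uses only the two numbers quoted in this summary (1700+ hours of video and up to 24TB of processed images) and assumes, purely for illustration, that storage scales linearly with video duration. The function names `estimate_storage_tb` and `hours_that_fit` are hypothetical helpers, not part of the repository, and the real per-hour footprint depends on frame rate, resolution, and image format.

```python
# Rough planning estimate only. Assumption: processed-image storage grows
# linearly with hours of video. Both constants come from the summary text:
# "1700+ hours" is a lower bound and "up to 24TB" is an upper bound.
FULL_HOURS = 1700.0
FULL_STORAGE_TB = 24.0


def estimate_storage_tb(hours: float) -> float:
    """Estimate processed-image storage (in TB) for a subset of `hours` of video."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return FULL_STORAGE_TB * hours / FULL_HOURS


def hours_that_fit(budget_tb: float) -> float:
    """Estimate how many hours of video fit in `budget_tb`, capped at the full set."""
    if budget_tb < 0:
        raise ValueError("budget_tb must be non-negative")
    return min(FULL_HOURS, FULL_HOURS * budget_tb / FULL_STORAGE_TB)


if __name__ == "__main__":
    for hours in (10, 100, 1700):
        print(f"{hours:>5} h of video -> about {estimate_storage_tb(hours):.2f} TB")
    print(f"A 2 TB disk holds about {hours_that_fit(2.0):.0f} h of processed video")
```

Under this linear assumption the dataset works out to roughly 14GB of processed images per hour of video, which is why starting from a smaller subset is the practical choice on limited hardware.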