DriveAGI by OpenDriveLab

Suite of projects for autonomous driving foundation models

Created 3 years ago

800 stars

Top 43.3% on SourcePulse

Project Summary

This repository provides a comprehensive suite of foundation models and datasets for advancing autonomous driving systems, targeting researchers and engineers in the field. It offers generalized predictive models, large-scale video datasets, and benchmarks for language-driven driving, aiming to accelerate the integration of large foundation models into autonomous driving.

How It Works

The project leverages large-scale datasets like OpenDV-YouTube (1700+ hours) and introduces models such as GenAD for generalized prediction and Vista for world modeling. These components are designed to enable high-fidelity, long-horizon future prediction and multi-modal action execution, addressing the need for robust and generalizable driving intelligence.

Quick Start & Requirements

OpenDV-YouTube Dataset: Raw videos are ~3TB, processed images ~24TB. A mini subset (44GB raw, 390GB processed) is available.
Vista: Code and models available at https://github.com/OpenDriveLab/Vista. Demo available at https://vista-demo.github.io.
GenAD: Paper available at https://arxiv.org/abs/2403.09630.
DriveLM: Repo available at https://github.com/OpenDriveLab/DriveLM.
OpenScene: Repo available at https://github.com/OpenDriveLab/OpenScene.
OpenLane-V2: Repo available at https://github.com/OpenDriveLab/OpenLane-V2.

Highlighted Details

GenAD: CVPR 2024 Highlight, utilizing the OpenDV dataset (1700+ hours), which is 300x larger than nuScenes.
Vista: NeurIPS 2024 paper, presenting the first generalizable driving world model capable of high-fidelity, long-horizon prediction.
OpenDV-YouTube: A large-scale dataset covering 244 cities in 40 countries, with a mini subset for easier experimentation.
DriveLM: Introduces language prompts for driving tasks, with benchmarks based on nuScenes and CARLA.
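
The "300x larger than nuScenes" figure in the GenAD entry can be sanity-checked with simple arithmetic. The sketch below assumes the comparison is by video duration and that nuScenes consists of 1000 scenes of roughly 20 seconds each; that nuScenes figure comes from its public description, not from this summary. The constant names are illustrative only.

```python
# Assumption: nuScenes has 1000 scenes of about 20 seconds each
# (taken from the public nuScenes description, not from this summary).
NUSCENES_SCENES = 1000
SCENE_SECONDS = 20

# Lower bound quoted above for OpenDV-YouTube ("1700+ hours").
OPENDV_HOURS = 1700

nuscenes_hours = NUSCENES_SCENES * SCENE_SECONDS / 3600
ratio = OPENDV_HOURS / nuscenes_hours

if __name__ == "__main__":
    print(f"nuScenes: about {nuscenes_hours:.2f} hours")
    print(f"OpenDV-YouTube / nuScenes: about {ratio:.0f}x")
```

Under these assumptions the ratio comes out slightly above 300, consistent with the rounded figure quoted above.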

Maintenance & Community

The project is actively updated, with recent releases including a mini subset of OpenDV-YouTube and the Vista model. Related challenges and leaderboards are hosted on opendrivelab.com.

Licensing & Compatibility

The README mentions a "YouTube license" for the OpenDV videos, which implies potential restrictions on commercial use or redistribution. The README does not state specific software licenses for the models and code.

Limitations & Caveats

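The full OpenDV dataset requires substantial storage (about 3TB for raw videos and up to 24TB for processed images), so working with it is impractical without significant resources; the mini subset (44GB raw, 390GB processed) is the more accessible starting point. The YouTube licensing of the video data may also restrict commercial use or redistribution.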
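
Given the sizes quoted under Quick Start & Requirements, it may help to check free disk space before downloading. The following is a minimal sketch, not part of the project: the helper names `required_gb` and `fits_on_disk` are hypothetical, the sizes are the approximate figures from this summary, and 1 TB is taken as 1000 GB.

```python
import shutil

# Approximate sizes quoted in this summary, in gigabytes (1 TB taken as 1000 GB).
OPENDV_SIZES_GB = {
    "full": {"raw": 3000, "processed": 24000},
    "mini": {"raw": 44, "processed": 390},
}


def required_gb(subset: str, keep_raw: bool = True) -> int:
    """Return the storage needed for a subset, optionally keeping raw videos."""
    sizes = OPENDV_SIZES_GB[subset]
    return sizes["processed"] + (sizes["raw"] if keep_raw else 0)


def fits_on_disk(subset: str, path: str = ".", keep_raw: bool = True) -> bool:
    """Check whether the filesystem containing `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb(subset, keep_raw)


if __name__ == "__main__":
    for name in OPENDV_SIZES_GB:
        print(f"{name}: {required_gb(name)} GB needed, fits: {fits_on_disk(name)}")
```

Keeping the raw videos alongside the processed images is the conservative case; passing `keep_raw=False` models deleting the raw files after processing.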

Health Check

Last Commit: 4 months ago
Responsiveness: 1 day

Star History

3 stars in the last 30 days