Dataset of compressed driving video for world models
This repository provides commaVQ, a large-scale dataset of compressed driving videos and associated models for world modeling and video prediction. It targets researchers and engineers in autonomous driving and AI, offering a foundation for building predictive world models on top of VQ-VAE-compressed video.
How It Works
The project uses a VQ-VAE to compress raw driving video frames into grids of discrete 10-bit tokens (a 1024-entry codebook). A GPT-style world model is then trained on millions of minutes of these compressed video segments to predict future tokens, and hence future frames. Working in this compact token space makes representing and predicting complex driving scenarios efficient, enabling more sample-efficient world models.
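As a rough illustration of this token pipeline, the sketch below uses assumed shapes (an 8x16 token grid per frame, 20 frames per second) and a stub in place of the trained GPT; these specifics are illustrative assumptions, not the repository's exact spec.

```python
import numpy as np

# Illustrative assumptions, not the exact commaVQ spec:
CODEBOOK_SIZE = 1024       # 10-bit tokens -> 2**10 codebook entries
TOKENS_PER_FRAME = 8 * 16  # assumed spatial grid of the VQ-VAE latent

rng = np.random.default_rng(0)

# A one-second clip at 20 fps becomes a short sequence of token grids.
clip = rng.integers(0, CODEBOOK_SIZE, size=(20, TOKENS_PER_FRAME))

# A GPT-style world model consumes the flattened token stream and is
# trained to predict the next token; this stub just returns zero logits.
def world_model_logits(token_stream):
    return np.zeros((len(token_stream), CODEBOOK_SIZE))

stream = clip.reshape(-1)          # flatten frames into one 1D sequence
logits = world_model_logits(stream)
print(stream.shape, logits.shape)  # (2560,) (2560, 1024)
```

Sampling from the predicted next-token distribution and decoding the tokens back through the VQ-VAE decoder is what turns such a model into a video predictor.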
Quick Start & Requirements
Requires the Hugging Face datasets library:

pip install datasets

Then load the dataset from the Hugging Face Hub:

from datasets import load_dataset
ds = load_dataset('commaai/commavq', num_proc=40)
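Once loaded, each example holds token ids for a video segment. As a hedged sketch (the shape and dtype below are assumptions for illustration, not the dataset's documented schema), a dummy token array can stand in for one segment to show typical preprocessing:

```python
import numpy as np

# Stand-in for one segment's tokens: the (1200, 8, 16) shape
# (1200 frames, 8x16 token grid) is an illustrative assumption.
rng = np.random.default_rng(0)
tokens = rng.integers(0, 1024, size=(1200, 8, 16), dtype=np.int64)

# Flatten each frame's token grid into a 1D sequence, the form a
# transformer world model would consume.
seq = tokens.reshape(tokens.shape[0], -1)
print(seq.shape)  # (1200, 128)
```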
Limitations & Caveats
The project's primary focus is on compressed video data and world modeling; it does not provide end-to-end self-driving solutions or pre-trained models for direct deployment in vehicles. The compression challenge has concluded.