research by commaai

Dataset and code for driving simulator research paper

created 9 years ago
4,126 stars

Top 12.1% on sourcepulse

Project Summary

This repository provides the dataset and code for the 2016 paper "Learning a Driving Simulator" by comma.ai. It offers 7.25 hours of driving footage and associated sensor data, suitable for training machine learning models for tasks like steering angle prediction and generative image modeling in autonomous driving research.

How It Works

The dataset pairs video clips recorded at 20 Hz with sensor measurements (speed, acceleration, steering angle, GPS, gyroscope) that have been transformed to a uniform 100 Hz time base. The data is stored in HDF5 format; camera frames form an array of shape number_frames x 3 x 160 x 320 (uint8). Because the video and the sensor logs run at different rates, the cam1_ptr HDF5 dataset is what aligns camera frames with the other sensor logs.
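As a rough illustration of that alignment step, the sketch below averages the 100 Hz sensor samples that point at each camera frame. It assumes that cam1_ptr holds one camera frame index per log sample, which is an interpretation of the description here and not something this summary confirms. The function name per_frame_mean and the sample values are hypothetical, and plain Python lists stand in for the HDF5 datasets.

```python
from collections import defaultdict


def per_frame_mean(cam1_ptr, values):
    """Average the log samples that point at each camera frame.

    cam1_ptr: sequence of frame indices, one per log sample (assumed layout).
    values:   sequence of sensor readings, same length as cam1_ptr.
    Returns a dict mapping frame index -> mean reading for that frame.
    """
    if len(cam1_ptr) != len(values):
        raise ValueError("cam1_ptr and values must have the same length")

    sums = defaultdict(float)
    counts = defaultdict(int)
    for frame_idx, value in zip(cam1_ptr, values):
        key = int(frame_idx)  # HDF5 pointers may be stored as floats
        sums[key] += float(value)
        counts[key] += 1

    return {frame: sums[frame] / counts[frame] for frame in sorted(sums)}


# Synthetic example: 100 Hz logs against 20 Hz video gives roughly
# five log samples per frame.
pointers = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
steering = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(per_frame_mean(pointers, steering))  # {0: 2.0, 1: 7.0}
```

With the real files, the two sequences would be read from the log HDF5 file (for example with h5py) instead of being typed in by hand; the dataset names to use there should be checked against the repository.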

Quick Start & Requirements

  • Install/Run: Download the dataset with ./get_data.sh or from archive.org.
  • Prerequisites: Anaconda, TensorFlow 0.9, Keras 1.0.6, OpenCV (cv2).
  • Dataset Size: 45 GB compressed, 80 GB uncompressed.
  • Documentation: archive.org
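The uncompressed size can be sanity-checked from the figures quoted on this page. The sketch below is a back-of-the-envelope estimate, assuming all 7.25 hours are recorded at exactly 20 Hz and counting only raw uint8 frame data (no sensor logs, no HDF5 overhead). The function names are hypothetical.

```python
# Figures taken from this page.
HOURS = 7.25
FPS = 20
CHANNELS, HEIGHT, WIDTH = 3, 160, 320  # one uint8 per value


def estimate_frames(hours=HOURS, fps=FPS):
    """Number of frames if every second of footage is sampled at fps."""
    return int(round(hours * 3600 * fps))


def estimate_video_bytes(n_frames):
    """Raw size of n_frames uint8 frames of shape 3 x 160 x 320."""
    return n_frames * CHANNELS * HEIGHT * WIDTH


n_frames = estimate_frames()
total_bytes = estimate_video_bytes(n_frames)
print(f"{n_frames} frames, about {total_bytes / 1e9:.1f} GB")
# 522000 frames, about 80.2 GB
```

Under these assumptions the raw frame data alone comes to roughly 80 GB, which agrees with the uncompressed size listed above.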

Highlighted Details

  • Dataset contains 7.25 hours of driving data from a 2016 Acura ILX.
  • Includes code for two ML experiments: steering angle prediction and generative image modeling.
  • Video is recorded at 20 Hz; the synchronized sensor logs are on a uniform 100 Hz time base.
  • Camera frames are stored as 3 x 160 x 320 uint8 arrays.

Maintenance & Community

  • Credits include Riccardo Biasini, George Hotz, Sam Khalandovsky, Eder Santana, and Niel van der Westhuizen.
  • The project includes a hiring call for people who can demonstrate skill with this dataset.

Licensing & Compatibility

  • Dataset is copyrighted by comma.ai and published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
  • Use is limited to non-commercial purposes, and derivative works must be shared under the same license.

Limitations & Caveats

The project relies on old versions of TensorFlow (0.9) and Keras (1.0.6), which may not be compatible with modern ML frameworks and hardware. At 45 GB compressed and 80 GB uncompressed, the dataset also requires significant storage and download time.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
