CVPR2023-3D-Occupancy-Prediction  by CVPR2023-3D-Occupancy-Prediction

3D occupancy prediction benchmark for autonomous driving scene perception

created 2 years ago
841 stars

Top 43.2% on sourcepulse

Project Summary

This repository hosts the CVPR 2023 3D Occupancy Prediction Challenge, providing a benchmark for autonomous driving scene perception. It addresses the limitations of traditional 3D bounding box detection by enabling dense, voxel-wise prediction of scene occupancy and semantics from surround-view images. The target audience includes researchers and engineers in autonomous driving and computer vision.

How It Works

The challenge focuses on predicting the occupancy state (free or occupied) and semantic class for each voxel in a 3D scene, using only camera images as input. This approach allows for a more detailed representation of the environment compared to bounding boxes, capturing complex object shapes and background elements. The benchmark utilizes a voxelized representation derived from the nuScenes dataset, requiring models to perform dense 3D prediction.
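
To make the voxelized target concrete, the sketch below maps labeled 3D points into a dense semantic grid. It only illustrates the representation and is not the benchmark's label-generation pipeline. The grid range, the 0.4 m voxel size, the free-space label id, and the function name voxelize are assumptions chosen for illustration; none of them is stated in this summary.

```python
import numpy as np

# Assumed grid parameters (illustrative, not taken from this summary):
# [x_min, y_min, z_min, x_max, y_max, z_max] in metres, ego-vehicle frame.
POINT_CLOUD_RANGE = np.array([-40.0, -40.0, -1.0, 40.0, 40.0, 5.4])
VOXEL_SIZE = 0.4   # assumed edge length of one cubic voxel, in metres
FREE_LABEL = 17    # hypothetical class id marking an unoccupied voxel


def voxelize(points, labels):
    """Map (N, 3) points with (N,) integer labels to a dense semantic grid.

    Voxels that receive no point keep FREE_LABEL. If several points fall
    into one voxel, the last one written wins (a simplification).
    """
    extent = POINT_CLOUD_RANGE[3:] - POINT_CLOUD_RANGE[:3]
    grid_shape = np.round(extent / VOXEL_SIZE).astype(int)
    grid = np.full(tuple(int(n) for n in grid_shape), FREE_LABEL, dtype=np.uint8)

    # Integer voxel index of every point along x, y, z.
    idx = np.floor((points - POINT_CLOUD_RANGE[:3]) / VOXEL_SIZE).astype(int)

    # Drop points that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < grid_shape), axis=1)
    idx, kept_labels = idx[inside], labels[inside]

    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = kept_labels
    return grid
```

Under these assumed parameters the grid has 200 x 200 x 16 voxels. A model in this challenge predicts such a grid from camera images alone, so a routine like this would only ever be used on the label side.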

Quick Start & Requirements

  • Baseline: A baseline model based on BEVFormer is provided. Refer to getting_started for details.
  • Data: The dataset is based on nuScenes, with mini (440MB), trainval (32GB), and test (6GB) splits available for download.
  • Submission: Results are submitted via an evaluation server, requiring a specific .npz format for each frame.
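
As a rough sketch of the per-frame .npz output mentioned above, the snippet writes one compressed NumPy archive per frame. The function name save_frame_prediction, the array key semantics, the uint8 dtype, and the file-naming scheme are all hypothetical; the exact layout expected by the evaluation server is defined by the challenge and is not given in this summary.

```python
from pathlib import Path

import numpy as np


def save_frame_prediction(out_dir, frame_token, semantics):
    """Write one frame's voxel predictions to <out_dir>/<frame_token>.npz.

    The key name "semantics" and the uint8 dtype are assumptions made
    for illustration, not the server's documented format.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{frame_token}.npz"
    np.savez_compressed(path, semantics=semantics.astype(np.uint8))
    return path
```

Reading the file back with np.load(path)["semantics"] is a quick way to confirm that the stored array has the expected shape before packaging a submission.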

Highlighted Details

  • Benchmark: The first large-scale 3D occupancy benchmark for autonomous driving.
  • Data: Voxelized representation with occupancy state and semantics, derived from nuScenes.
  • Evaluation: Primarily ranked by mean Intersection over Union (mIoU).
  • Input: Camera images only; no future frames allowed during inference.
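
The mIoU ranking can be illustrated with a minimal sketch that scores only voxels selected by a boolean visibility mask, in the spirit of the mask_camera used by the evaluation. This is not the official evaluation code: the function name masked_miou and its signature are assumptions, and benchmark-specific choices, such as whether the free-space class is left out of the average, are not covered.

```python
import numpy as np


def masked_miou(pred, gt, mask, num_classes):
    """Mean IoU over classes, computed only where mask is True.

    pred, gt: integer class ids with identical shapes.
    mask:     boolean array of the same shape (True = voxel is scored).
    Classes that appear in neither pred nor gt are skipped.
    """
    keep = mask.astype(bool)
    pred = pred[keep].astype(np.int64)
    gt = gt[keep].astype(np.int64)

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(gt * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    if not valid.any():
        return float("nan")
    return float((tp[valid] / union[valid]).mean())
```

For example, with predictions [0, 1, 1], ground truth [0, 1, 2], and all three voxels visible, the per-class IoUs are 1, 0.5, and 0, so the sketch returns 0.5.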

Licensing & Compatibility

  • Dataset: Subject to nuScenes dataset terms of use.
  • Code: MIT License.

Limitations & Caveats

The nuScenes dataset has known issues with z-axis translation, which can affect precise 6D localization and the accumulation of point clouds across frames. As a result, some data exhibits ground stratification. Evaluation applies a mask_camera so that voxels not visible to the cameras are excluded from scoring.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 14 stars in the last 90 days
