PonderV2 by OpenGVLab

3D pre-training framework for efficient 3D representations

Created 2 years ago

369 stars

Top 76.5% on SourcePulse

Project Summary

PonderV2 is a universal pre-training framework for 3D foundation models, enabling efficient learning of 3D representations through differentiable neural rendering. It targets researchers and practitioners in 3D computer vision, offering a unified approach to bridge 2D and 3D data modalities.

How It Works

PonderV2 pre-trains a point cloud encoder with differentiable neural rendering. The encoded 3D representation is rendered into 2D images, and the rendered images are compared against real ones, so 2D supervision shapes the 3D features. Because the objective depends only on rendering, the same pre-training recipe can be applied across different 3D data sources.
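
To make the rendering step concrete, here is a minimal sketch of classic density-based alpha compositing along a single ray. It is an illustration of the general volume rendering idea only, not PonderV2's implementation: the function name `composite_ray` and its arguments are hypothetical, and the NeuS renderer mentioned in the requirements derives opacity from a signed distance field rather than from a raw density. Written in an autodiff framework, the same arithmetic would yield gradients with respect to the per-sample densities and colors.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: non-negative density at each sample (hypothetical input)
    colors:    (r, g, b) tuple at each sample
    deltas:    distance between consecutive samples
    Returns the composited pixel color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        # Contribution = opacity here times light that survived so far.
        weight = transmittance * alpha
        weights.append(weight)
        rgb = [acc + weight * c for acc, c in zip(rgb, color)]
        transmittance *= 1.0 - alpha
    return rgb, weights


# Empty space, then an opaque green surface that hides the blue sample behind it.
pixel, sample_weights = composite_ray(
    densities=[0.0, 1e9, 1.0],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    deltas=[1.0, 1.0, 1.0],
)
```

In a pre-training loop, a pixel rendered this way would be compared with the corresponding pixel of a real image, and the error would be backpropagated to the encoder that produced the densities and colors.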

Quick Start & Requirements

Installation: Requires Ubuntu 18.04+, CUDA 11.3+, and PyTorch 1.10.0+. Setup involves creating a conda environment and installing the dependencies: pytorch-cluster, pytorch-scatter, pytorch-sparse, spconv, torch-geometric, opencv-python, open3d, and CLIP. spconv and the NeuS renderer must be built to match the CUDA architecture of the target GPU.
Data Preparation: Detailed instructions are available in docs/data_preparation.md.
Pre-trained Models: Checkpoints are available via docs/model_zoo.md.
Documentation: Comprehensive guides for installation and usage are in docs/getting_started.md.
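
As a quick sanity check before following the installation guide, the sketch below reports which of the listed dependencies are not importable and compares a version string against a minimum. It is not part of the PonderV2 repository: the helper names (`parse_version`, `meets_minimum`, `missing_modules`) are hypothetical, and the import names in `REQUIRED_MODULES` are the usual module names for these packages, which are assumed here rather than taken from the README.

```python
import importlib.util
import re

# Import name -> package name as listed in the requirements.
# The import names are assumptions based on how these packages are usually imported.
REQUIRED_MODULES = {
    "torch": "PyTorch",
    "torch_cluster": "pytorch-cluster",
    "torch_scatter": "pytorch-scatter",
    "torch_sparse": "pytorch-sparse",
    "spconv": "spconv",
    "torch_geometric": "torch-geometric",
    "cv2": "opencv-python",
    "open3d": "open3d",
    "clip": "CLIP",
}


def parse_version(text):
    """Turn a string such as '1.12.1+cu113' into a tuple such as (1, 12, 1)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '11.3' to (11, 3, 0) so tuples compare cleanly
        parts.append(0)
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)


def missing_modules(modules=REQUIRED_MODULES):
    """Return the package names whose import name cannot be located."""
    return [
        package
        for name, package in modules.items()
        if importlib.util.find_spec(name) is None
    ]
```

Calling `missing_modules()` inside the conda environment lists whatever still needs installing, and `meets_minimum(torch.__version__, "1.10.0")` checks the PyTorch requirement once `torch` is importable. The check only confirms that modules can be found; it does not verify that spconv or the renderer were compiled for the right CUDA architecture.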

Highlighted Details

Achieves state-of-the-art results on ScanNet and S3DIS for semantic segmentation.
Supports multi-dataset training and Point Prompt Training (PPT).
Includes implementations for downstream tasks like semantic and instance segmentation.
Offers pre-trained checkpoints for indoor and outdoor datasets.

Maintenance & Community

The project is associated with Shanghai AI Lab, HKU, ZJU, and USTC. Updates include bug fixes for indoor pre-training and Structured3D data preprocessing. Further details on the roadmap or community channels are not explicitly provided in the README.

Licensing & Compatibility

The project is released under a permissive license, allowing for commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that indoor pre-training configurations had bugs prior to a specific commit, requiring users to ensure they are using the fixed versions. Similarly, Structured3D RGB-D data preprocessing had bugs, necessitating regeneration of processed data if older versions were used.
