PonderV2 by OpenGVLab

3D pre-training framework for efficient 3D representations

Created 1 year ago
359 stars

Top 77.9% on SourcePulse

Project Summary

PonderV2 is a universal pre-training framework for 3D foundation models, enabling efficient learning of 3D representations through differentiable neural rendering. It targets researchers and practitioners in 3D computer vision, offering a unified approach to bridge 2D and 3D data modalities.

How It Works

PonderV2 pre-trains a point cloud encoder using differentiable neural rendering. Features extracted from the 3D input are rendered into 2D images, and those renderings are compared against real images, so 2D supervision shapes the learned 3D representation. This is how the framework bridges 2D and 3D data, and why it is presented as a universal pre-training paradigm.
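To make the rendering step concrete, here is a minimal sketch of compositing samples along a single camera ray and scoring the result against a real pixel. It is an illustration under stated assumptions, not PonderV2's implementation: the installation notes mention a NeuS renderer, whereas this sketch swaps in the simpler NeRF-style density compositing, and the names `composite_ray` and `photometric_loss` are hypothetical. In a real pipeline the densities and colors would be predicted from the encoder's features inside an autodiff framework, so the loss could be backpropagated into the 3D encoder.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray (NeRF-style volume rendering).

    densities: non-negative density at each sample along the ray
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment each sample covers
    Returns the rendered (r, g, b) pixel and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        weights.append(weight)
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return tuple(pixel), weights


def photometric_loss(rendered, target):
    """Mean absolute error between a rendered pixel and the real pixel."""
    return sum(abs(r - t) for r, t in zip(rendered, target)) / len(target)


# Hypothetical ray: empty space first, then a dense green surface,
# then a blue sample that the surface almost fully occludes.
pixel, weights = composite_ray(
    densities=[0.0, 50.0, 50.0],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    deltas=[0.1, 0.1, 0.1],
)
loss = photometric_loss(pixel, (0.0, 1.0, 0.0))
```

The rendered pixel is dominated by the first dense sample the ray hits, which is what lets a 2D image constrain where surfaces sit in 3D.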

Quick Start & Requirements

  • Installation: Requires Ubuntu 18.04+, CUDA 11.3+, and PyTorch 1.10.0+. Create a conda environment, then install the dependencies, which include pytorch-cluster, pytorch-scatter, pytorch-sparse, spconv, torch-geometric, opencv-python, open3d, and CLIP. spconv and the NeuS renderer need to be compiled for your specific CUDA architecture.
  • Data Preparation: Detailed instructions are available in docs/data_preparation.md.
  • Pre-trained Models: Checkpoints are available via docs/model_zoo.md.
  • Documentation: Comprehensive guides for installation and usage are in docs/getting_started.md.
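Before installing, it can help to confirm that an environment meets the version minimums listed above. The sketch below is a hypothetical helper, not part of PonderV2: `MINIMUMS`, `parse_version`, `meets_minimum`, and `check_environment` are illustrative names. It compares versions numerically, because a plain string comparison would wrongly rank "1.9.1" above "1.10.0".

```python
import re

# Minimums taken from the requirements listed above.
MINIMUMS = {"cuda": "11.3", "torch": "1.10.0"}


def parse_version(text):
    """Turn '1.10.0+cu113' into (1, 10, 0), ignoring build suffixes."""
    core = re.match(r"\d+(?:\.\d+)*", text)
    if core is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in core.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare numerically, padding with zeros so '11.3' equals '11.3.0'."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(found):
    """Return {name: bool} for each requirement present in `found`."""
    return {
        name: meets_minimum(found[name], minimum)
        for name, minimum in MINIMUMS.items()
        if name in found
    }
```

With PyTorch installed, the inputs could come from `torch.__version__` and `torch.version.cuda`, for example `check_environment({"torch": torch.__version__, "cuda": torch.version.cuda})`. Note that `torch.version.cuda` is `None` on CPU-only builds, so that case needs handling before calling the helper.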

Highlighted Details

  • Achieves state-of-the-art results on ScanNet and S3DIS for semantic segmentation.
  • Supports multi-dataset training and Point Prompt Training (PPT).
  • Includes implementations for downstream tasks like semantic and instance segmentation.
  • Offers pre-trained checkpoints for indoor and outdoor datasets.

Maintenance & Community

The project is associated with Shanghai AI Lab, HKU, ZJU, and USTC. Updates include bug fixes for indoor pre-training and Structured3D data preprocessing. Further details on the roadmap or community channels are not explicitly provided in the README.

Licensing & Compatibility

The project is released under a permissive license, allowing for commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that indoor pre-training configurations had bugs prior to a specific commit, requiring users to ensure they are using the fixed versions. Similarly, Structured3D RGB-D data preprocessing had bugs, necessitating regeneration of processed data if older versions were used.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
