SAM2Point by ZiyuGuo99

3D segmentation by adapting the Segment Anything Model (SAM)

Created 1 year ago
342 stars

Top 80.8% on SourcePulse

Project Summary

SAM2Point adapts the Segment Anything Model 2 (SAM 2) for zero-shot and promptable 3D segmentation, targeting researchers and practitioners in 3D computer vision. It offers flexibility across various prompt types (points, boxes, masks) and diverse 3D data scenarios, aiming for efficient and generalizable 3D segmentation.

How It Works

The framework leverages SAM 2's architecture to process 3D data, treating it conceptually as multi-directional videos. This approach allows for promptable segmentation using 3D-specific inputs, enabling zero-shot generalization across different object types and scene complexities. The core advantage lies in adapting a powerful 2D segmentation model to the 3D domain with minimal architectural changes, preserving promptability and efficiency.
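To make the "3D as multi-directional videos" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the 3D data is a colored point cloud that is first voxelized, and that a video is just the sequence of 2D slices of the voxel grid read outward from a prompt location along each axis. The function names, the 0.02 voxel size, and the six-direction split are illustrative assumptions.

```python
import numpy as np

def voxelize(points, colors, resolution=0.02):
    """Map an (N, 3) point cloud with (N, 3) colors onto a dense voxel grid.

    Hypothetical helper: the resolution and the 'last point wins' rule are
    assumptions made for illustration.
    """
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / resolution).astype(int)
    shape = idx.max(axis=0) + 1
    grid = np.zeros((*shape, 3), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = colors  # last point in a voxel wins
    return grid, idx

def directional_videos(grid, anchor):
    """Split the grid at an anchor voxel into six frame sequences.

    Each sequence starts at the slice containing the anchor and moves away
    from it, forward or backward along one axis.
    """
    videos = {}
    for axis, name in enumerate("xyz"):
        frames = np.moveaxis(grid, axis, 0)   # slices stacked along `axis`
        a = anchor[axis]
        videos[f"+{name}"] = frames[a:]       # anchor slice, then forward
        videos[f"-{name}"] = frames[a::-1]    # anchor slice, then backward
    return videos

points = np.array([[0.0, 0.0, 0.0], [0.05, 0.03, 0.01], [0.09, 0.09, 0.09]])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
grid, idx = voxelize(points, colors)
videos = directional_videos(grid, anchor=tuple(idx[1]))
```

Each of the six sequences could then be passed to a video segmentation model as ordinary frames, with the prompt placed on the first frame. That is the sense in which a 2D video model can be reused without architectural changes.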

Quick Start & Requirements

  • Install: Clone the repository, create a conda environment (conda create -n sam2point python=3.10), activate it, and install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python >= 3.10, PyTorch >= 2.3.1, TorchVision >= 0.18.1.
  • Setup: Download the SAM 2 checkpoints and the sample 3D data before running.
  • Links: Webpage, HuggingFace Demo, arXiv Report (Note: Actual links are placeholders as they were not provided in the README).
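
Before downloading checkpoints, it can help to confirm that the environment meets the version floors listed above. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical and are not part of the SAM2Point repository.

```python
import re
import sys
from importlib import metadata

# Minimum versions taken from the prerequisites listed above.
MINIMUMS = {"torch": "2.3.1", "torchvision": "0.18.1"}

def version_tuple(version):
    """Leading numeric components of a version string.

    Local build tags such as '+cu121' are ignored, so '2.3.1+cu121'
    becomes (2, 3, 1).
    """
    return tuple(int(n) for n in re.findall(r"\d+", version.split("+")[0])[:3])

def check_environment(minimums=MINIMUMS):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python >= 3.10 is required")
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed (need >= {minimum})")
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{package} {installed} is older than {minimum}")
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script inside the activated conda environment prints one line per unmet requirement and nothing when all requirements are satisfied.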

Highlighted Details

  • Claims to be the "most faithful implementation of SAM in 3D."
  • Claims advantages in implementation efficiency, promptable flexibility, and generalization.
  • Generates multi-directional videos of the segmentation process.
  • Supports 3D points, boxes, and masks as prompts.
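
As an illustration of how 3D prompts might be handed to a 2D video model, the sketch below projects a 3D point or an axis-aligned 3D box, given in voxel coordinates, onto the slices perpendicular to one axis. This is an assumption about one plausible mapping, not the project's documented behavior, and both function names are hypothetical.

```python
def point_prompt_on_slice(point, axis):
    """Return (frame_index, (u, v)) for a 3D point viewed along `axis`.

    The frame index is the coordinate along the viewing axis; (u, v) are
    the two remaining coordinates in their original order.
    """
    keep = [a for a in range(3) if a != axis]
    return point[axis], (point[keep[0]], point[keep[1]])

def box_prompt_on_slices(box_min, box_max, axis):
    """Project an axis-aligned 3D box to one 2D box per covered slice.

    Returns {frame_index: (u_min, v_min, u_max, v_max)} for every slice
    between the box's extent along `axis`, inclusive.
    """
    keep = [a for a in range(3) if a != axis]
    rect = (box_min[keep[0]], box_min[keep[1]],
            box_max[keep[0]], box_max[keep[1]])
    return {i: rect for i in range(box_min[axis], box_max[axis] + 1)}

frame, uv = point_prompt_on_slice((2, 1, 0), axis=1)
boxes = box_prompt_on_slices((1, 2, 3), (4, 5, 6), axis=2)
```

A 3D mask prompt could be handled the same way by taking its 2D cross-section on each slice, although the summary does not describe that step.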

Maintenance & Community

The project accompanies the arXiv paper "SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners." The README's related-work section links to further research.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README states that code for custom 3D input and prompts will be released soon, so support for user-defined inputs is currently limited. The authors describe the project as a "preliminary exploration."

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days
