SAM2Point by ZiyuGuo99

3D segmentation by adapting the Segment Anything Model 2 (SAM 2)

created 11 months ago
332 stars

Top 83.7% on sourcepulse

Project Summary

SAM2Point adapts the Segment Anything Model 2 (SAM 2) for zero-shot and promptable 3D segmentation, targeting researchers and practitioners in 3D computer vision. It offers flexibility across various prompt types (points, boxes, masks) and diverse 3D data scenarios, aiming for efficient and generalizable 3D segmentation.

How It Works

The framework leverages SAM 2's architecture to process 3D data, treating it conceptually as multi-directional videos. This approach allows for promptable segmentation using 3D-specific inputs, enabling zero-shot generalization across different object types and scene complexities. The core advantage lies in adapting a powerful 2D segmentation model to the 3D domain with minimal architectural changes, preserving promptability and efficiency.
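The "3D as multi-directional videos" idea can be illustrated with a small sketch. It is not the repository's code: it assumes the 3D input is first voxelized and that each axis is then walked in both directions starting from the prompt location, so that every direction yields an ordered sequence of 2D slices (the "frames" of one video). The function names `voxelize` and `directional_frames` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: List[Point], voxel_size: float) -> Tuple[Set[Voxel], Voxel]:
    """Map each point to an integer voxel index.

    Returns the set of occupied voxels and the grid shape (nx, ny, nz).
    """
    mins = [min(p[a] for p in points) for a in range(3)]
    occupied = {
        tuple(int((p[a] - mins[a]) // voxel_size) for a in range(3))
        for p in points
    }
    shape = tuple(max(v[a] for v in occupied) + 1 for a in range(3))
    return occupied, shape


def directional_frames(shape: Voxel, prompt_voxel: Voxel) -> Dict[str, List[int]]:
    """List the slice indices visited along each axis, in both directions.

    Every entry is the frame order of one hypothetical "video": it starts at
    the slice containing the prompt and moves away from it.
    """
    frames: Dict[str, List[int]] = {}
    for axis, name in enumerate("xyz"):
        start = prompt_voxel[axis]
        frames[f"+{name}"] = list(range(start, shape[axis]))
        frames[f"-{name}"] = list(range(start, -1, -1))
    return frames


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.5, 0.5, 2.0), (3.0, 1.0, 0.5)]
    _, grid_shape = voxelize(cloud, voxel_size=0.5)
    for direction, order in directional_frames(grid_shape, (3, 1, 4)).items():
        print(direction, order)
```

In this sketch each direction could be handed to a video segmentation model as a separate clip whose first frame carries the prompt; how SAM2Point actually merges the per-direction results is not described in this summary.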

Quick Start & Requirements

  • Install: Clone the repository, create a conda environment (conda create -n sam2point python=3.10), activate it, and install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python >= 3.10, PyTorch >= 2.3.1, TorchVision >= 0.18.1.
  • Setup: Download the SAM 2 checkpoints and the sample 3D data before running.
  • Links: Webpage, HuggingFace Demo, arXiv Report (the URLs are not reproduced in this summary).
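After installing, it can help to confirm that the environment meets the stated minimums. The following is a minimal sketch rather than part of the project: the helper names `parse_version`, `meets_minimum`, and `report` are hypothetical, and the version parsing is deliberately simple (it ignores build suffixes such as `+cu121` and treats pre-releases as the release they precede).

```python
import re
import sys
from importlib import metadata
from typing import Dict, Tuple

# Minimums as listed in the prerequisites above.
MINIMUMS: Dict[str, Tuple[int, ...]] = {
    "torch": (2, 3, 1),
    "torchvision": (0, 18, 1),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '2.3.1+cu121' into (2, 3, 1)."""
    core = text.split("+", 1)[0]
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed: str, minimum: Tuple[int, ...]) -> bool:
    """Compare an installed version string against a minimum version tuple."""
    return parse_version(installed) >= minimum


def report() -> Dict[str, str]:
    """Return a status string for Python and for each required package."""
    status = {"python": "ok" if sys.version_info[:2] >= (3, 10) else "too old"}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "not installed"
            continue
        status[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return status


if __name__ == "__main__":
    for package, state in report().items():
        print(f"{package}: {state}")
```

Run inside the activated `sam2point` environment, the script prints one line per requirement.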

Highlighted Details

  • Claims to be the "most faithful implementation of SAM in 3D."
  • Claims superior implementation efficiency, prompt flexibility, and generalization.
  • Generates multi-directional videos of the segmentation process.
  • Supports 3D points, boxes, and masks as prompts.
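To make the prompt types concrete, here is a hedged sketch of how a 3D box prompt might be reduced to a 2D box on a single slice, so that a 2D or video segmentation model could consume it. This is an assumption for illustration, not the repository's API; `Box3D`, `Box2D`, and `box_on_slice` are hypothetical names, and boxes are given as inclusive voxel indices.

```python
from typing import Optional, Tuple

# (x0, y0, z0, x1, y1, z1) with inclusive voxel indices; hypothetical layout.
Box3D = Tuple[int, int, int, int, int, int]
# (u0, v0, u1, v1) in the two axes that remain after slicing.
Box2D = Tuple[int, int, int, int]


def box_on_slice(box: Box3D, axis: int, index: int) -> Optional[Box2D]:
    """Return the 2D box visible on slice `index` along `axis`.

    Returns None when the slice lies outside the 3D box, in which case
    that frame would carry no box prompt.
    """
    low, high = box[:3], box[3:]
    if not low[axis] <= index <= high[axis]:
        return None
    keep = [a for a in range(3) if a != axis]
    return (low[keep[0]], low[keep[1]], high[keep[0]], high[keep[1]])


if __name__ == "__main__":
    prompt = (1, 2, 3, 4, 5, 6)
    print(box_on_slice(prompt, axis=2, index=4))
    print(box_on_slice(prompt, axis=0, index=0))
```

A point prompt would reduce in the same way by dropping the sliced coordinate, and a 3D mask by taking its cross-section on the slice.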

Maintenance & Community

The project accompanies the arXiv paper "SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners." The README's related-work section links to further research.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Code for custom 3D input and prompts has not been released yet (the README says it will be released soon), so user-defined inputs are currently not supported. The project is described as a "preliminary exploration."

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

15 stars in the last 90 days
