Prompt-Segment-Anything by RockeyCoss

Zero-shot instance segmentation using Segment Anything (SAM)

created 2 years ago
311 stars

Top 87.6% on sourcepulse

Project Summary

This repository implements zero-shot instance segmentation on top of the Segment Anything (SAM) model. It targets computer vision researchers and practitioners who need instance segmentation without task-specific training data, and it improves mask quality by prompting SAM with boxes from strong object detectors.

How It Works

The project integrates SAM with various object detection backbones (e.g., Swin-L, FocalNet-L) and detectors (e.g., H-Deformable-DETR, DINO). The detectors generate bounding boxes, which are then used as prompts for SAM to produce instance masks. This approach benefits from the zero-shot capabilities of SAM while enhancing segmentation accuracy through the precise localization provided by state-of-the-art detectors. It also supports advanced prompting techniques like cascade prompts (box + mask) and multimask output for more refined results.
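The box-to-mask flow described above can be sketched in a few lines. This is a minimal illustration, not the repository's code: `box_fill_segmenter` is a hypothetical stand-in that simply fills the box so the example runs without checkpoints, and `segment_instances`, `score_thr`, and the detection tuple layout are assumptions made for the sketch. In a real setup the stand-in would be replaced by a box-prompted SAM call that returns candidate masks with quality scores.

```python
import numpy as np


def box_fill_segmenter(image, box):
    """Hypothetical stand-in for a box-prompted segmenter such as SAM.

    Returns two candidate masks (mimicking multimask output) and one
    made-up quality score per candidate.
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    x0, x1 = max(0, x0), min(w, x1)
    y0, y1 = max(0, y0), min(h, y1)

    # Candidate 0: the whole box.
    full = np.zeros((h, w), dtype=bool)
    full[y0:y1, x0:x1] = True

    # Candidate 1: the box shrunk by one pixel on each side.
    inner = np.zeros((h, w), dtype=bool)
    inner[y0 + 1:max(y0 + 1, y1 - 1), x0 + 1:max(x0 + 1, x1 - 1)] = True

    masks = np.stack([full, inner])
    scores = np.array([0.9, 0.6])  # assumed values, for illustration only
    return masks, scores


def segment_instances(image, detections, segmenter, score_thr=0.3):
    """Turn detector boxes into instance masks via a box-prompted segmenter.

    detections: iterable of (box_xyxy, label, detection_score).
    """
    instances = []
    for box, label, det_score in detections:
        if det_score < score_thr:
            continue  # drop low-confidence boxes before prompting
        masks, scores = segmenter(image, box)
        best = int(np.argmax(scores))  # keep the highest-scoring candidate
        instances.append({"label": label, "score": det_score, "mask": masks[best]})
    return instances


image = np.zeros((8, 8, 3), dtype=np.uint8)
detections = [((1, 1, 5, 4), "cat", 0.92), ((0, 0, 8, 8), "dog", 0.10)]
instances = segment_instances(image, detections, box_fill_segmenter)
```

Keeping the segmenter behind a plain callable means the detector and the mask model can be swapped independently, which mirrors the detector-plus-SAM pairing the project describes.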

Quick Start & Requirements

  • Installation:
      1. Clone the repository and install PyTorch.
      2. Install MMCV: pip install -U openmim, then mim install "mmcv-full<2.0.0".
      3. Install the MMDetection requirements: pip install -r requirements.txt.
      4. Compile the CUDA operators: cd projects/instance_segment_anything/ops && python setup.py build install.
      5. From the repository root, set PYTHONPATH=$(pwd).
  • Prerequisites: Python 3.7.10, PyTorch 1.10.2, CUDA 10.2 (tested versions). Requires SAM checkpoints (ViT-B, L, H) and detection model checkpoints.
  • Demo: A Gradio demo is available via pip install gradio and python app.py.
  • Documentation: Configuration files are linked for evaluation and visualization.
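Because the tested versions listed above are fairly old, it can help to compare a local environment against them before compiling the CUDA operators. The following is a rough sketch under stated assumptions: `TESTED`, `parse_version`, and `compare_environment` are hypothetical helpers written for this example, not part of the repository. In practice the `found` values could come from `platform.python_version()`, `torch.__version__`, and `torch.version.cuda`.

```python
import re

# Versions the README reports as tested.
TESTED = {"python": "3.7.10", "torch": "1.10.2", "cuda": "10.2"}


def parse_version(text):
    """Parse the leading numeric components, e.g. '1.10.2+cu102' -> (1, 10, 2)."""
    core = text.split("+")[0]  # drop local build tags such as '+cu102'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def compare_environment(found, tested=TESTED):
    """Return one note per component that is missing or differs from the tested version."""
    notes = []
    for name, tested_version in tested.items():
        local = found.get(name)
        if local is None:
            notes.append(f"{name}: not found (tested with {tested_version})")
        elif parse_version(local) != parse_version(tested_version):
            notes.append(f"{name}: found {local}, tested with {tested_version}")
    return notes


# Example values for a newer setup (illustrative, not measured).
report = compare_environment({"python": "3.10.12", "torch": "2.1.0+cu118", "cuda": "11.8"})
for line in report:
    print(line)
```

A mismatch does not mean the build will fail; it only flags where the local setup departs from the combination the authors tested.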

Highlighted Details

  • Achieves strong COCO instance segmentation results, with mask AP up to 49.1 (FocalNet-L+DINO+SAM-ViT-H).
  • Supports multimask output and cascade prompt modes for enhanced segmentation.
  • Integrates with MMDetection, H-Deformable-DETR, and FocalNet-DINO.
  • Offers a HuggingFace Gradio demo for easy visualization.

Maintenance & Community

The project cites foundational works like Segment Anything, H-Deformable-DETR, Swin Transformer, DINO, and FocalNet. No specific community channels (Discord/Slack) or active maintenance signals are mentioned in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, it relies on components with their own licenses (e.g., Segment Anything, MMDetection). Users should verify compatibility for commercial use.

Limitations & Caveats

The tested environment specifies older versions of PyTorch and CUDA, suggesting potential compatibility issues with newer setups. The README does not detail specific limitations or known bugs.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
