Prompt-Segment-Anything by RockeyCoss

Zero-shot instance segmentation using Segment Anything (SAM)

Created 2 years ago

317 stars

Top 85.4% on SourcePulse

Project Summary

This repository implements zero-shot instance segmentation on top of the Segment Anything (SAM) model. It targets computer vision researchers and practitioners who need instance masks without task-specific training data, and it pairs SAM with strong object detectors whose boxes serve as prompts.

How It Works

The project integrates SAM with various object detection backbones (e.g., Swin-L, FocalNet-L) and detectors (e.g., H-Deformable-DETR, DINO). The detectors generate bounding boxes, which are then used as prompts for SAM to produce instance masks. This approach benefits from the zero-shot capabilities of SAM while enhancing segmentation accuracy through the precise localization provided by state-of-the-art detectors. It also supports advanced prompting techniques like cascade prompts (box + mask) and multimask output for more refined results.
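The box-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not the repository's code: `segment_detections`, `best_mask`, and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in that just fills the box so the example runs without any checkpoint. In a real setup the predictor would wrap SAM; the cascade step here (feeding the first-pass mask back alongside the box) is an assumption about how a box + mask prompt could be wired.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Mask = List[List[bool]]          # H x W binary mask
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive

# Hypothetical predictor interface: (box, mask_input, multimask_output)
# -> (candidate masks, quality scores). Loosely modelled on a SAM predictor.
PredictFn = Callable[[Box, Optional[Mask], bool], Tuple[List[Mask], List[float]]]


def best_mask(masks: Sequence[Mask], scores: Sequence[float]) -> Mask:
    """Pick the candidate mask with the highest predicted quality score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return masks[best_index]


def segment_detections(
    boxes: Sequence[Box],
    predict: PredictFn,
    cascade: bool = False,
    multimask: bool = True,
) -> List[Mask]:
    """Turn detector boxes into one instance mask per box."""
    results = []
    for box in boxes:
        # First pass: the detector box is the only prompt.
        masks, scores = predict(box, None, multimask)
        mask = best_mask(masks, scores)
        if cascade:
            # Second pass (assumed cascade scheme): box + previous mask.
            masks, scores = predict(box, mask, multimask)
            mask = best_mask(masks, scores)
        results.append(mask)
    return results


def toy_predict(box, mask_input, multimask_output, size=(6, 8)):
    """Stand-in for SAM: returns the filled box, plus a shrunken variant
    as a lower-scored candidate when multimask output is requested."""
    height, width = size
    x0, y0, x1, y1 = box
    full = [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]
    if not multimask_output:
        return [full], [0.9]
    inner = [[x0 + 1 <= x < x1 - 1 and y0 + 1 <= y < y1 - 1 for x in range(width)]
             for y in range(height)]
    return [inner, full], [0.5, 0.9]


if __name__ == "__main__":
    detections = [(1, 1, 5, 4), (4, 2, 8, 6)]
    for mask in segment_detections(detections, toy_predict, cascade=True):
        print("".join("#" if px else "." for px in mask[2]))  # one row per mask
```

Keeping the predictor behind a plain callable means the same loop works for a single-mask or multimask predictor, and the detector can be swapped without touching the segmentation step.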

Quick Start & Requirements

Installation: Clone the repository and install PyTorch. Install MMCV (pip install -U openmim, then mim install "mmcv-full<2.0.0"), install the MMDetection requirements (pip install -r requirements.txt), compile the CUDA operators (cd projects/instance_segment_anything/ops && python setup.py build install), and set PYTHONPATH=$(pwd).
Prerequisites: Python 3.7.10, PyTorch 1.10.2, CUDA 10.2 (tested versions). Requires SAM checkpoints (ViT-B, L, H) and detection model checkpoints.
Demo: A Gradio demo is available via pip install gradio and python app.py.
Documentation: Configuration files are linked for evaluation and visualization.

Highlighted Details

Achieves strong COCO instance segmentation results, with mask AP up to 49.1 (FocalNet-L+DINO+SAM-ViT-H).
Supports multimask output and cascade prompt modes for enhanced segmentation.
Integrates with MMDetection, H-Deformable-DETR, and FocalNet-DINO.
Offers a HuggingFace Gradio demo for easy visualization.

Maintenance & Community

The project cites foundational works like Segment Anything, H-Deformable-DETR, Swin Transformer, DINO, and FocalNet. No specific community channels (Discord/Slack) or active maintenance signals are mentioned in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, it relies on components with their own licenses (e.g., Segment Anything, MMDetection). Users should verify compatibility for commercial use.

Limitations & Caveats

The tested environment specifies older versions of PyTorch and CUDA, suggesting potential compatibility issues with newer setups. The README does not detail specific limitations or known bugs.
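Before installing, it may help to compare the local environment against the tested versions listed under Quick Start. The script below is a rough sketch with hypothetical helper names (`parse_version`, `differs_from_tested`, `report`). It only compares major.minor numbers for Python and PyTorch, does not check CUDA, and a mismatch does not by itself mean the project will fail to build.

```python
import sys
from itertools import takewhile
from typing import List, Optional, Tuple

# Tested versions as listed in the summary (Python 3.7.10, PyTorch 1.10.2).
TESTED = {"python": "3.7.10", "torch": "1.10.2"}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.10.2+cu102' into (1, 10, 2)."""
    core = text.split("+", 1)[0]  # drop local build suffixes like '+cu102'
    numbers = []
    for piece in core.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def installed_torch_version() -> Optional[str]:
    """Return the installed PyTorch version, or None if it is not importable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__


def differs_from_tested(name: str, found: str) -> bool:
    """True when major.minor of `found` differs from the tested version."""
    return parse_version(found)[:2] != parse_version(TESTED[name])[:2]


def report() -> List[str]:
    """Build one status line per checked component."""
    found = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "torch": installed_torch_version(),
    }
    lines = []
    for name, version in found.items():
        if version is None:
            lines.append(f"{name}: not installed (tested with {TESTED[name]})")
        elif differs_from_tested(name, version):
            lines.append(f"{name}: {version} differs from tested {TESTED[name]}")
        else:
            lines.append(f"{name}: {version} matches tested {TESTED[name]}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```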

Explore Similar Projects

PixelRefer by alibaba-damo-academy

segment-anything-with-clip by Curt-Park

segment-anything-webui by Kingfish404

CLIP-SAM by maxi-w

CIoU by Zzh-tju

ComfyUI-YoloWorld-EfficientSAM by ZHO-ZHO-ZHO

ComfyUI-RMBG by 1038lab

comfyui_segment_anything by storyicon

clipseg by timojl

sd-webui-segment-anything by continue-revolution

FastSAM by CASIA-LMC-Lab

Pytorch-UNet by milesial