Grounded-SAM-2 by IDEA-Research

Video object tracker using open-world models

created 1 year ago
2,533 stars

Top 18.9% on sourcepulse

Project Summary

Grounded SAM 2 provides a pipeline for grounding and tracking any object in videos using open-world models such as Grounding DINO, Florence-2, and SAM 2. It targets researchers and developers working on video analysis, object detection, and segmentation, and aims to keep the implementation of these pipelines simple.

How It Works

This project builds on the idea of assembling open-world models, as its predecessor Grounded SAM did. It uses Grounding DINO (including versions 1.5 and 1.6) and DINO-X for object detection and Florence-2 for various vision tasks, and passes their outputs to SAM 2 for segmentation and tracking. Because each stage is a separate model, the pipeline stays modular, and the project focuses on keeping it simple to use.
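The detect, segment, and track flow described above can be sketched as follows. This is a minimal illustration, not the project's code: `Detection`, `ground_and_track`, the three stage callables, and the `min_score` threshold are all hypothetical names chosen for this example, and real models would need to be wrapped to match these signatures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class Detection:
    label: str
    box: Box
    score: float


# Hypothetical stage signatures (assumptions for this sketch only):
#   detect(frame, prompt)      -> list of Detection
#   segment(frame, box)        -> mask
#   propagate(frames, seeds)   -> {frame_index: {object_id: mask}}
Detector = Callable[[object, str], List[Detection]]
Segmenter = Callable[[object, Box], object]
Propagator = Callable[[Sequence[object], Dict[int, object]], Dict[int, Dict[int, object]]]


def ground_and_track(
    frames: Sequence[object],
    prompt: str,
    detect: Detector,
    segment: Segmenter,
    propagate: Propagator,
    min_score: float = 0.3,  # illustrative threshold, not a project default
):
    """Detect on the first frame, segment each kept box, then propagate the masks."""
    first = frames[0]
    kept = [d for d in detect(first, prompt) if d.score >= min_score]

    # Give every kept detection an integer ID that stays fixed across frames.
    labels = {obj_id: d.label for obj_id, d in enumerate(kept, start=1)}
    seed_masks = {obj_id: segment(first, d.box) for obj_id, d in enumerate(kept, start=1)}

    # The tracker carries the seed masks through the rest of the video.
    tracks = propagate(frames, seed_masks)
    return labels, tracks
```

Keeping the stages as plain callables means a different detector (for example, a cloud-hosted one) can be swapped in without touching the segmentation or tracking step.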

Quick Start & Requirements

  • Installation: Run pip install -e . to install Grounded SAM 2 and pip install --no-build-isolation -e grounding_dino to install Grounding DINO. Docker is also supported via make build-image and make run.
  • Prerequisites: Python 3.10, PyTorch >= 2.3.1, torchvision >= 0.18.1, and CUDA 12.1. Grounding DINO 1.5/1.6 and DINO-X additionally require the dds-cloudapi-sdk package (pip install dds-cloudapi-sdk --upgrade) and an API token.
  • Setup: Requires downloading pretrained checkpoints for SAM 2 and Grounding DINO.
  • Demos: Numerous demos are available for image and video tasks, covering both Hugging Face and local model inference.
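The version prerequisites listed above can be checked before installation with a short script. The sketch below uses only the Python standard library; the helper names (`parse_version`, `meets_minimum`, `check_environment`) are made up for this example, and the comparison is deliberately simple: it ignores local build tags such as `+cu121` and treats pre-release suffixes as the base version.

```python
import re
from importlib import metadata
from typing import Dict, Tuple

# Minimum versions taken from the prerequisites list above.
MINIMUMS = {"torch": "2.3.1", "torchvision": "0.18.1"}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.3.1+cu121' into (2, 3, 1); any suffix after the digits is ignored."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> Dict[str, str]:
    """Report 'ok', 'missing', or 'too old (...)' for each required package."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

The script does not verify the CUDA toolkit or the Python version; those would need separate checks.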

Highlighted Details

  • Supports Grounding DINO 1.5/1.6 and DINO-X for enhanced open-set detection.
  • Integrates Florence-2 for diverse tasks like dense region captioning and auto-labeling.
  • Offers SAHI (Slicing Aided Hyper Inference) for high-resolution images with dense objects.
  • Provides robust video object tracking with various prompt types (point, box, mask) and custom video inputs.
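To illustrate the slicing idea behind SAHI mentioned above, the sketch below computes overlapping tile windows for a large image and maps a box detected inside a tile back to full-image coordinates. It is not the SAHI library's API: the function names, the default tile size, and the overlap ratio are assumptions chosen for this example.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-image pixels


def slice_starts(length: int, tile: int, overlap: float) -> List[int]:
    """Start offsets of tiles along one axis, with the last tile flush to the edge."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if tile >= length:
        return [0]  # a single tile already covers this axis
    stride = max(1, int(tile * (1.0 - overlap)))
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def slice_windows(width: int, height: int, tile: int = 512, overlap: float = 0.2) -> List[Window]:
    """All tile windows in row-major order, clipped to the image bounds."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in slice_starts(height, tile, overlap)
        for x in slice_starts(width, tile, overlap)
    ]


def to_full_image(box: Tuple[float, float, float, float], window: Window):
    """Shift a box detected in tile coordinates back into full-image coordinates."""
    ox, oy = window[0], window[1]
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

A detector would be run on each window separately. Because tiles overlap, the same object can be detected more than once, so a full pipeline would also merge duplicates (for example with non-maximum suppression), which this sketch omits.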

Maintenance & Community

The project is actively updated, with recent changes including API updates for Grounding DINO 1.5/1.6 and DINO-X, and support for SAM 2.1.

Licensing & Compatibility

The project's license is not explicitly stated in the README. It cites research papers that may carry their own licensing terms. Suitability for commercial use is not specified.

Limitations & Caveats

The "Continuous ID" tracking feature is noted as still under development and not entirely stable. Some models (Grounding DINO 1.5/1.6, DINO-X) require an API token from the official website.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 2

Star History

490 stars in the last 90 days
