Open-GroundingDino by longzw1997

Third-party GroundingDINO implementation for open-set object detection

created 1 year ago
656 stars

Top 51.9% on sourcepulse

Project Summary

This repository provides a third-party implementation of the Grounding DINO paper for open-set object detection and grounding. It targets researchers and practitioners who want to fine-tune or pre-train models on custom datasets, which goes beyond what the official release offers.

How It Works

The project leverages the DINO architecture, enhanced with grounded pre-training. It supports a custom odvg data format for training, which unifies object detection (OD) and visual grounding (VG) datasets. This approach allows for flexible data integration and training on diverse datasets, including large-scale ones like GRIT-20M.
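To make the idea of a unified OD/VG format concrete, here is a minimal Python sketch of how detection and grounding samples could share one record layout. The JSON Lines layout and every field name used here (filename, detection, instances, grounding, regions, phrase, and so on) are assumptions chosen for illustration; the repository's data_format.md is the authoritative description.

```python
import json


def od_record(filename, height, width, boxes):
    """Build one object-detection record (assumed layout).

    boxes: list of (bbox, label_id, category_name); bbox is [x1, y1, x2, y2].
    """
    return {
        "filename": filename,
        "height": height,
        "width": width,
        "detection": {
            "instances": [
                {"bbox": list(bbox), "label": label, "category": name}
                for bbox, label, name in boxes
            ]
        },
    }


def vg_record(filename, height, width, caption, regions):
    """Build one visual-grounding record (assumed layout).

    regions: list of (bbox, phrase); each phrase is a span of the caption.
    """
    return {
        "filename": filename,
        "height": height,
        "width": width,
        "grounding": {
            "caption": caption,
            "regions": [
                {"bbox": list(bbox), "phrase": phrase} for bbox, phrase in regions
            ],
        },
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


def record_kind(record):
    """Tell OD and VG records apart by which top-level key is present."""
    return "OD" if "detection" in record else "VG"
```

Because both record types share the image-level fields, a single loader can read a mixed file and branch on `record_kind`, which is the kind of flexibility the unified format is meant to provide.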

Quick Start & Requirements

  • Install: Clone the repository, install dependencies via pip install -r requirements.txt, and compile custom ops (cd models/GroundingDINO/ops && python setup.py build install).
  • Prerequisites: Python 3.7.11, PyTorch 1.11.0, CUDA 11.3.
  • Resources: Requires downloading pre-trained models and BERT weights.
  • Docs: Dataset conversion details are in data_format.md.
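Before compiling the custom ops, it can help to confirm that the local environment matches the listed prerequisites. The following is a small sketch, not part of the repository: the helper names are hypothetical, and the source only lists these versions, so whether other versions also work is not stated.

```python
import platform

# Versions listed under Prerequisites above.
EXPECTED = {"python": "3.7.11", "torch": "1.11.0", "cuda": "11.3"}


def version_matches(found, expected):
    """True if `found` equals `expected`, optionally with a local build
    suffix such as '+cu113' (as in '1.11.0+cu113')."""
    if found is None:
        return False
    return found == expected or found.startswith(expected + "+")


def detect_versions():
    """Collect installed versions; torch entries stay None if torch is missing."""
    found = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch

        found["torch"] = torch.__version__
        found["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return found


def report(found, expected=EXPECTED):
    """Map each component name to whether its version matches the expected one."""
    return {name: version_matches(found.get(name), want) for name, want in expected.items()}
```

Calling `report(detect_versions())` returns a dictionary of booleans, one per component, which makes a mismatch easy to spot before a failed build.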

Highlighted Details

  • Supports training on both object detection and grounding datasets.
  • Implements Slurm multi-machine support and training acceleration strategies.
  • Offers inference capabilities with provided pre-trained models.
  • Allows fine-tuning and pre-training from scratch.

Maintenance & Community

  • The project is actively maintained by its authors, Zuwei Long and Wei Li.
  • Contact information for suggestions and bug reports is provided. Contributions via pull requests are welcomed.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is a third-party implementation and may not perfectly replicate all features or performance of the official Grounding DINO. Training support for object detection data was initially marked as '✖' in the README's feature table, though the text indicates training is supported. Evaluation on custom test sets requires careful configuration of use_coco_eval and label_list.
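As a rough illustration of the evaluation caveat, a custom test set would plausibly be configured along the following lines. Only the option names use_coco_eval and label_list come from the text above; the values, the example category names, and the helper check_eval_config are assumptions for illustration and are not part of the repository.

```python
# Hypothetical config values; only the two option names come from the summary.
use_coco_eval = False  # assumption: turn off COCO-specific evaluation for a custom test set
label_list = ["dog", "cat", "person"]  # assumption: category names in label-id order


def check_eval_config(use_coco_eval, label_list):
    """Illustrative sanity check (hypothetical helper).

    Returns a list of human-readable problems; an empty list means none found.
    """
    problems = []
    if not use_coco_eval:
        if not label_list:
            problems.append("label_list must be non-empty when use_coco_eval is False")
        elif len(set(label_list)) != len(label_list):
            problems.append("label_list contains duplicate category names")
    return problems
```

Running a check like this before launching evaluation catches the most likely slip, which is leaving one of the two options at a value that does not match the dataset.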

Health Check

  • Last commit: 6 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 2
  • Star history: 76 stars in the last 90 days
