Object detection via grounded pre-training research paper
Grounding DINO is an open-source PyTorch implementation for open-set object detection, enabling users to detect any object specified by natural language prompts. It is designed for researchers and developers working on advanced computer vision tasks, offering high performance and flexibility for applications like image editing and automated annotation.
How It Works
Grounding DINO combines the DINO (DETR with Improved deNoising anchOr boxes) object detection framework with grounded pre-training, which lets it localize objects from textual descriptions and achieve strong zero-shot performance. The architecture has five main components: a text backbone, an image backbone, a feature enhancer, language-guided query selection, and a cross-modality decoder.
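To make the language-guided query selection step more concrete, here is a minimal pure-Python sketch of one common reading of it: score each image token by its best similarity to any text token, then keep the top-scoring tokens as decoder queries. The function name `select_queries`, the toy feature vectors, and the use of plain lists instead of PyTorch tensors are all assumptions made for illustration. This is not the repository's implementation.

```python
def select_queries(image_feats, text_feats, num_queries):
    """Return indices of the image tokens that best match any text token.

    Hypothetical helper for illustration only.
    image_feats: list of image-token feature vectors (lists of floats)
    text_feats:  list of text-token feature vectors of the same dimension
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # For each image token, keep its highest similarity to any text token.
    best_scores = [max(dot(img, txt) for txt in text_feats) for img in image_feats]

    # Rank image tokens by that score and keep the top `num_queries` indices.
    ranked = sorted(range(len(image_feats)), key=lambda i: best_scores[i], reverse=True)
    return ranked[:num_queries]


# Toy data: four image tokens and two text tokens in a 2-D feature space.
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
text_feats = [[1.0, 0.0], [0.9, 0.1]]
selected = select_queries(image_feats, text_feats, num_queries=2)
print(selected)
```

In a real model the same idea would typically be expressed as a batched matrix product followed by a max over the text dimension and a top-k over image tokens.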
Quick Start & Requirements
Install with pip install -e . within the cloned repository.
Ensure CUDA_HOME is set correctly if using CUDA.
Download groundingdino_swint_ogc.pth from the releases page.
Run inference with:
CUDA_VISIBLE_DEVICES={GPU ID} python demo/inference_on_a_image.py -c groundingdino/config/GroundingDINO_SwinT_OGC.py -p weights/groundingdino_swint_ogc.pth -i image_you_want_to_detect.jpg -o "output_dir" -t "your text prompt"
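For scripted runs, the inference command can be assembled from Python rather than typed by hand. The sketch below only builds the environment and argument list; it uses the script path and the `-c`, `-p`, `-i`, `-o`, `-t` flags exactly as they appear in the command above. The helper name `build_inference_command` and the choice of GPU 0 are assumptions for illustration.

```python
import os
import shlex


def build_inference_command(gpu_id, config, weights, image, output_dir, prompt):
    """Return (env, argv) for the demo script; nothing is executed here.

    Hypothetical helper, not part of the repository.
    """
    # Copy the current environment and pin the visible GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    argv = [
        "python", "demo/inference_on_a_image.py",
        "-c", config,
        "-p", weights,
        "-i", image,
        "-o", output_dir,
        "-t", prompt,
    ]
    return env, argv


env, argv = build_inference_command(
    gpu_id=0,  # assumed GPU index
    config="groundingdino/config/GroundingDINO_SwinT_OGC.py",
    weights="weights/groundingdino_swint_ogc.pth",
    image="image_you_want_to_detect.jpg",
    output_dir="output_dir",
    prompt="your text prompt",
)
print(shlex.join(argv))
# To actually run it from the repository root:
#   import subprocess; subprocess.run(argv, env=env, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when the text prompt contains spaces or punctuation.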
Highlighted Details
Maintenance & Community
The project is actively maintained by IDEA-Research and IDEA-CVR. Related projects like Grounded-SAM and Semantic-SAM are also available. Community support channels are not explicitly listed, but the project is associated with the authors' research group.
Licensing & Compatibility
The repository does not explicitly state a license in the README. However, it is common for research implementations to be for non-commercial use unless otherwise specified. Compatibility for commercial use or closed-source linking should be verified.
Limitations & Caveats
Training code is not yet released. The README notes a potential NameError: name '_C' is not defined if the installation steps are not followed strictly, which requires re-cloning and reinstalling. The COCO zero-shot evaluation result mentioned in the README (48.5) differs from the claimed benchmark (52.5 AP).
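Before re-cloning, a quick check can show whether the environment is likely to hit the `_C` error. The sketch below looks at `CUDA_HOME` and tries to import the compiled extension. It assumes the extension is importable as `groundingdino._C`, which is inferred from the error message rather than stated in the text above, and `diagnose_install` is a hypothetical helper. On a CPU-only setup, an unset `CUDA_HOME` may be expected rather than a problem.

```python
import importlib
import os


def diagnose_install(environ=None, importer=importlib.import_module):
    """Return a list of human-readable problems; empty means none were found.

    Hypothetical helper, not part of the repository.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # CUDA_HOME should be set correctly when building with CUDA.
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif not os.path.isdir(cuda_home):
        problems.append(f"CUDA_HOME points to a missing directory: {cuda_home}")

    # Assumption: the compiled extension lives at `groundingdino._C`.
    try:
        importer("groundingdino._C")
    except ImportError as exc:
        problems.append(f"compiled extension not importable: {exc}")

    return problems


for problem in diagnose_install():
    print("WARNING:", problem)
```

If the extension fails to import after `CUDA_HOME` has been fixed, the remedy given above still applies: re-clone and reinstall.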