Meta-DETR by ZhangGongjie

PyTorch implementation for few-shot object detection research

created 4 years ago
406 stars

Project Summary

Meta-DETR is a PyTorch implementation of a few-shot object detector. It addresses the problem of generalizing from base classes with abundant annotations to novel classes with only a few labeled examples, and improves that generalization by exploiting inter-class correlations. It is aimed at computer vision and deep learning researchers and practitioners working on few-shot detection.

How It Works

Meta-DETR performs meta-learning at the image level rather than on region proposals, which sidesteps the proposal quality gap common in R-CNN-based few-shot detectors. It meta-learns over multiple support classes simultaneously, so it can exploit inter-class correlations to improve generalization. The authors report better performance than prior few-shot object detectors.
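To make the idea of learning over several support classes at once more concrete, here is a minimal sketch of an episode sampler. It is not taken from the repository: the function name `sample_episode`, its parameters, and the `images_by_class` mapping are all hypothetical and chosen only for illustration.

```python
import random


def sample_episode(images_by_class, num_support_classes, shots_per_class, rng=None):
    """Pick several support classes at once and a few example images for each.

    Hypothetical helper, not the repository's data loader. `images_by_class`
    maps a class name to a list of image identifiers.
    """
    rng = rng or random.Random()
    # Only classes with enough examples can provide a full support set.
    eligible = sorted(
        name for name, images in images_by_class.items()
        if len(images) >= shots_per_class
    )
    if len(eligible) < num_support_classes:
        raise ValueError("not enough classes with the requested number of shots")
    # All support classes are drawn for the same episode, so a model trained
    # on such episodes sees them together instead of one class at a time.
    chosen = rng.sample(eligible, num_support_classes)
    return {name: rng.sample(images_by_class[name], shots_per_class) for name in chosen}


if __name__ == "__main__":
    toy = {"cat": [1, 2, 3], "dog": [4, 5, 6], "bus": [7, 8], "cow": [9]}
    print(sample_episode(toy, num_support_classes=2, shots_per_class=2, rng=random.Random(0)))
```

A single episode built this way contains support examples for every chosen class, which is the precondition for comparing classes against each other within one forward pass.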

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment with Python 3.7, install PyTorch 1.7.1+cu102 and TorchVision 0.8.2+cu102, and compile Deformable Attention operators.
  • Prerequisites: NVIDIA GPUs (tested with 8x V100), CUDA 10.2, Python 3.7, PyTorch 1.7.1, TorchVision 0.8.2, GCC 7.5.0, cython, pycocotools, tqdm, scipy.
  • Data: Requires MS-COCO 2017 or Pascal VOC datasets organized in specific directory structures. Few-shot splits are provided.
  • Links: Paper, Pre-trained Weights (for base training stage).
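Because the tested setup is pinned to specific versions, a quick check before compiling the Deformable Attention operators may save time. The sketch below compares an environment against the versions listed above. The helper names (`find_mismatches`, `collect_versions`) are illustrative and not part of the repository, and a mismatch is only a warning sign, not proof that the code will fail.

```python
import platform
from importlib import import_module

# Versions listed above as tested; other versions may still work.
TESTED = {"python": "3.7", "torch": "1.7.1", "torchvision": "0.8.2", "cuda": "10.2"}


def base_version(version):
    """Drop a local build suffix such as '+cu102' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(found, tested=TESTED):
    """Return {name: (found, tested)} for each entry that differs from the tested setup."""
    mismatches = {}
    for name, want in tested.items():
        have = found.get(name)
        if have is None:
            mismatches[name] = (None, want)
            continue
        base = base_version(have)
        # Accept an exact match or a longer version with the same prefix (3.7.10 for 3.7).
        if base != want and not base.startswith(want + "."):
            mismatches[name] = (have, want)
    return mismatches


def collect_versions():
    """Gather versions from the running interpreter; missing packages map to None."""
    found = {"python": platform.python_version(), "cuda": None}
    for package in ("torch", "torchvision"):
        try:
            found[package] = import_module(package).__version__
        except ImportError:
            found[package] = None
    try:
        # torch.version.cuda is None on CPU-only builds of PyTorch.
        found["cuda"] = import_module("torch").version.cuda
    except ImportError:
        pass
    return found


if __name__ == "__main__":
    for name, (have, want) in find_mismatches(collect_versions()).items():
        print(f"{name}: found {have}, tested with {want}")
```

The check covers only Python, PyTorch, TorchVision, and the CUDA version PyTorch was built against. GCC and the remaining Python packages would need to be verified separately.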

Highlighted Details

  • Official PyTorch implementation of the T-PAMI 2022 paper.
  • Reports state-of-the-art few-shot object detection performance at the time of publication.
  • Exploits inter-class correlations for improved generalization.
  • Bypasses proposal quality gap for better few-shot performance.

Maintenance & Community

The project is associated with authors from academia, including Eric P. Xing. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

Released under the MIT license. Users are responsible for ensuring compliance with all license requirements, including those of prior works. Commercial use is permitted under MIT license terms.

Limitations & Caveats

The implementation was tested on specific, older versions of Ubuntu, CUDA, and PyTorch, and the README recommends matching that setup exactly. Broader compatibility is suggested, but users may run into setup problems in other environments. The project also relies on custom CUDA operators for Deformable Attention, which must be compiled before use.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 14 stars in the last 90 days
