SKU110K_CVPR19 by eg4000

Object detector research paper for densely packed scenes

Created 6 years ago
817 stars

Top 43.4% on SourcePulse

Project Summary

This repository provides a dataset and codebase for precise object detection in densely packed scenes, targeting researchers and practitioners in computer vision. It addresses the challenge of accurately identifying and localizing numerous, often similar, objects in close proximity, a common issue in retail and urban environments.

How It Works

The core innovation is a Soft-IoU layer added to an object detector. For each detected bounding box, this layer estimates the Jaccard index (intersection over union, IoU) between the box and the ground truth, which serves as a per-detection quality score. The detections, together with their Soft-IoU scores, are then modeled as a Mixture of Gaussians. An Expectation-Maximization (EM) based merger unit clusters these Gaussians so that overlapping detections of the same object collapse into a single detection, giving more precise results in crowded scenes.
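The Jaccard index that the Soft-IoU layer learns to estimate can be illustrated with a minimal sketch. This is not code from the repository: the `(x1, y1, x2, y2)` box format and the helper name `soft_iou_target` are assumptions made for illustration, and taking the best overlap with any ground-truth box is one plausible way to define a per-detection target.

```python
def iou(box_a, box_b):
    """Jaccard index (intersection over union) of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def soft_iou_target(detection, ground_truth_boxes):
    """Hypothetical helper: highest IoU between one detection and any ground-truth box."""
    return max((iou(detection, gt) for gt in ground_truth_boxes), default=0.0)


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A score near 1 means the detection sits almost exactly on an object, while a score near 0 flags a box that straddles neighbors or misses entirely.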

Quick Start & Requirements

  • Install: Requires Keras 2.2.4+, TensorFlow-GPU, Keras-resnet, six, scipy, Pillow, pandas, and tqdm. Tested with Python 3.6.5 and OpenCV 3.1.
  • Dataset: SKU-110K dataset (110k categories) available for academic and non-commercial use.
  • Pretrained Model: Available for download.
  • Usage: Detailed training and prediction commands are provided using keras-retinanet as a base.

Highlighted Details

  • Introduces the SKU-110K dataset, specifically designed for densely packed scenes.
  • Implements a Soft-IoU layer for improved detection quality estimation.
  • Features an EM-Merger unit to resolve overlapping detections.
  • Built upon the keras-retinanet framework.
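The idea behind an EM-based merger can be sketched as follows. This is a deliberately simplified illustration and not the repository's EM-Merger: it assumes detections are reduced to their center points, uses their quality scores as sample weights, fits diagonal-covariance Gaussians, and takes the number of clusters `k` as given. The function name `em_merge_centers` and all parameters are hypothetical.

```python
import numpy as np


def em_merge_centers(centers, scores, k, n_iter=50):
    """Score-weighted EM over detection centers; returns (cluster means, labels)."""
    x = np.asarray(centers, dtype=float)            # (n, 2) box centers
    w = np.asarray(scores, dtype=float)
    w = w / w.sum()                                 # normalized per-detection weights

    # Deterministic farthest-point initialization, starting from the top-scoring box.
    means = [x[np.argmax(w)]]
    for _ in range(1, k):
        d = np.min([np.sum((x - m) ** 2, axis=1) for m in means], axis=0)
        means.append(x[np.argmax(d)])
    means = np.array(means)                         # (k, 2)
    var = np.tile(x.var(axis=0) + 1e-6, (k, 1))     # (k, 2) diagonal variances
    pi = np.full(k, 1.0 / k)                        # mixing proportions

    for _ in range(n_iter):
        # E-step: responsibilities from diagonal-Gaussian log densities.
        diff = x[:, None, :] - means[None, :, :]    # (n, k, 2)
        log_p = -0.5 * np.sum(diff ** 2 / var + np.log(2 * np.pi * var), axis=2)
        log_p += np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)     # (n, k)

        # M-step: each detection contributes in proportion to its score.
        rw = resp * w[:, None]
        nk = rw.sum(axis=0) + 1e-12
        means = (rw.T @ x) / nk[:, None]
        diff = x[:, None, :] - means[None, :, :]
        var = np.einsum("nk,nkd->kd", rw, diff ** 2) / nk[:, None] + 1e-6
        pi = nk / nk.sum()

    return means, resp.argmax(axis=1)


if __name__ == "__main__":
    centers = [(10, 10), (11, 9), (9, 11), (100, 100), (101, 99), (99, 101)]
    scores = [0.9, 0.8, 0.7, 0.9, 0.6, 0.8]
    means, labels = em_merge_centers(centers, scores, k=2)
    print(means, labels)
```

In this sketch each resulting cluster stands for one merged detection, and weighting by score pulls the merged center toward the boxes the detector trusts most.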

Maintenance & Community

The project accompanies the CVPR 2019 paper "Precise Detection in Densely Packed Scenes." Contributions are welcome.

Licensing & Compatibility

The dataset is provided for academic and non-commercial use only. The license for the codebase is not explicitly stated; the code is built on keras-retinanet.

Limitations & Caveats

The authors note that the codebase is still under testing and may contain glitches. The provided EM-merger is the stable version but is not time-optimized. The dataset license restricts commercial use.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days
