SKU110K_CVPR19 by eg4000

Object detection research code and dataset for densely packed scenes

Created 6 years ago

826 stars

Top 43.0% on SourcePulse

1 Expert Loves This Project

Aravind Srinivas

Cofounder of Perplexity

Project Summary

This repository provides a dataset and codebase for precise object detection in densely packed scenes, targeting researchers and practitioners in computer vision. It addresses the challenge of accurately identifying and localizing numerous, often similar, objects in close proximity, a common issue in retail and urban environments.

How It Works

The core innovation is a Soft-IoU layer added to an object detector. For each detected bounding box, this layer estimates the Jaccard index (intersection over union, IoU) between that box and the ground-truth box it should match, which serves as a quality score for the detection. The detections, together with their Soft-IoU scores, are then modeled as a Mixture of Gaussians. An Expectation-Maximization (EM) based merger unit clusters these Gaussians to resolve overlapping detections, giving more precise results in crowded scenes.
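
To make the two ingredients above concrete, here is a minimal sketch of the Jaccard index for a pair of boxes and of one way to turn scored detections into a Gaussian mixture. It is not taken from the repository. The function names `iou` and `detections_to_mixture` are hypothetical, and the parameterization (mean at the box center, standard deviation of a quarter of the box width and height, mixture weights proportional to the Soft-IoU scores) is an assumption made for illustration.

```python
import numpy as np


def iou(box_a, box_b):
    """Jaccard index (intersection over union) of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detections_to_mixture(boxes, soft_iou_scores):
    """Return (weights, means, covariances) of a 2D Gaussian mixture.

    Assumed parameterization (illustrative only): one Gaussian per box,
    centered on the box, with a diagonal covariance whose standard
    deviations are a quarter of the box width and height, and a weight
    proportional to the box's Soft-IoU score.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(soft_iou_scores, dtype=float)
    means = np.stack(
        [(boxes[:, 0] + boxes[:, 2]) / 2, (boxes[:, 1] + boxes[:, 3]) / 2],
        axis=1,
    )
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    covs = np.zeros((len(boxes), 2, 2))
    covs[:, 0, 0] = (widths / 4) ** 2
    covs[:, 1, 1] = (heights / 4) ** 2
    weights = scores / scores.sum()
    return weights, means, covs
```

An EM-style merger would then fit a smaller set of Gaussians to this mixture and report one box per fitted component. That step is omitted here because its details depend on the actual implementation.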

Quick Start & Requirements

Install: Requires Keras 2.2.4+, TensorFlow-GPU, Keras-resnet, six, scipy, Pillow, pandas, and tqdm. Tested with Python 3.6.5 and OpenCV 3.1.
Dataset: SKU-110K dataset (110k categories) available for academic and non-commercial use.
Pretrained Model: Available for download.
Usage: Detailed training and prediction commands are provided using keras-retinanet as a base.
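
Before running training or prediction, it can help to confirm that the listed dependencies are importable. The snippet below is a small sketch for that purpose and is not part of the repository. The helper `check_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions based on the usual module names for these packages (for example, Pillow imports as `PIL` and OpenCV as `cv2`).

```python
import importlib

# Assumed import names for the packages listed above.
REQUIRED_MODULES = [
    "keras", "tensorflow", "keras_resnet", "six",
    "scipy", "PIL", "pandas", "tqdm", "cv2",
]


def check_modules(names):
    """Map each module name to its version string, "unknown", or None.

    None means the module could not be imported; "unknown" means it was
    imported but exposes no __version__ attribute.
    """
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = str(getattr(module, "__version__", "unknown"))
    return report
```

Calling `check_modules(REQUIRED_MODULES)` and looking for `None` values gives a quick list of what is missing. The reported versions can then be compared against the ones the project was tested with.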

Highlighted Details

Introduces the SKU-110K dataset, specifically designed for densely packed scenes.
Implements a Soft-IoU layer for improved detection quality estimation.
Features an EM-Merger unit to resolve overlapping detections.
Built upon the keras-retinanet framework.
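
As an illustration of how a Soft-IoU score could be learned, the sketch below computes a binary cross-entropy between predicted scores and the true IoU of each detection with its ground-truth box. This assumes the layer is trained by pushing each predicted score toward the true IoU; the repository's actual loss may differ. The function name `soft_iou_loss` and the `eps` parameter are hypothetical.

```python
import numpy as np


def soft_iou_loss(predicted_scores, true_ious, eps=1e-7):
    """Mean binary cross-entropy between predicted scores and true IoUs.

    Both inputs are sequences of values in [0, 1]. Predictions are clipped
    to [eps, 1 - eps] so the logarithms stay finite.
    """
    c = np.clip(np.asarray(predicted_scores, dtype=float), eps, 1.0 - eps)
    t = np.asarray(true_ious, dtype=float)
    return float(np.mean(-(t * np.log(c) + (1.0 - t) * np.log(1.0 - c))))
```

Because the targets are continuous IoU values and not 0/1 labels, the loss is smallest when the predicted score equals the true overlap. This is what lets the score act as a quality estimate at inference time, when no ground truth is available.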

Maintenance & Community

The project accompanies the CVPR 2019 paper "Precise Detection in Densely Packed Scenes." Contributions are welcome.

Licensing & Compatibility

The dataset is provided for academic and non-commercial use only. The license for the codebase is not explicitly stated; the code is built on keras-retinanet.

Limitations & Caveats

The codebase is described as still under testing and may contain glitches. The included EM-merger is a stable version that has not been optimized for speed. The dataset license restricts commercial use.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days