hand_object_detector  by ddshan

Hand object detector for understanding human hands in contact

created 5 years ago
272 stars

Project Summary

This repository provides a PyTorch implementation of a hand-object detector based on Faster R-CNN. It targets the problem of understanding human hands in contact with objects in internet-scale data. It is aimed at researchers and developers in human-computer interaction, robotics, or activity recognition who need to detect hands and classify their interactions with objects.

How It Works

The detector is built on the Faster R-CNN object detection architecture with a ResNet-101 backbone for feature extraction. On top of standard box detection for hands and objects, it classifies each hand's contact state (e.g., self-contact, contact with another person, or contact with an object) and predicts hand side (left/right).

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (handobj_new), and install PyTorch 1.12.1 with CUDA 11.3. Compile CUDA dependencies by running python setup.py build develop in the lib directory. Install Python dependencies via pip install -r requirements.txt.
  • Prerequisites: Python 3.8, PyTorch 1.12.1, CUDA 11.3.
  • Data: Requires data in Pascal VOC format. Pre-prepared data and pre-trained ResNet-101 models are available for download.
  • Links: Project and dataset webpage: https://github.com/ddshan/hand_object_detector
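Since the data is expected in Pascal VOC format, it can help to see what reading one annotation looks like. The following is a minimal sketch using only the Python standard library. It assumes the standard VOC XML layout (`filename`, `size`, and `object` entries with a `bndbox`); the sample annotation, the label names `hand` and `targetobject`, and the helper `parse_voc` are hypothetical and not taken from the repository, whose annotations may carry additional fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the standard Pascal VOC layout. The file name
# and label names are illustrative only, not copied from the dataset.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>hand</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>180</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>targetobject</name>
    <bndbox><xmin>170</xmin><ymin>150</ymin><xmax>300</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return the file name, image size, and labelled boxes of one annotation."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    objects = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # VOC stores corner coordinates as (xmin, ymin, xmax, ymax) in pixels.
        box = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), box))

    return {
        "filename": root.findtext("filename"),
        "size": (width, height),
        "objects": objects,
    }


if __name__ == "__main__":
    annotation = parse_voc(SAMPLE_XML)
    for label, box in annotation["objects"]:
        print(label, box)
```

A loader for real training data would read each XML file from the dataset's annotation directory rather than from an inline string.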

Highlighted Details

  • Achieves high AP scores on custom datasets (e.g., 90.4 for Hand Object detection on handobj_100K+ego).
  • Offers models trained on different dataset combinations (handobj_100K, handobj_100K+ego) for varying performance characteristics.
  • Provides detailed output formats including bounding boxes, confidence scores, hand states, and offset vectors.
  • Includes a matching.py script for post-processing detection results.
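To illustrate how bounding boxes, hand states, and offset vectors can be combined in post-processing, here is a hedged sketch of one plausible matching rule: each in-contact hand is assigned to the object whose box center lies closest to the point its offset vector points at. This is not the logic of `matching.py`; the `Detection` and `HandDetection` types, the field names, and the choice to express the offset as a pixel displacement are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    score: float


@dataclass
class HandDetection(Detection):
    # Hypothetical fields: the real output encoding may differ.
    in_contact: bool             # simplified stand-in for the hand state
    offset: Tuple[float, float]  # assumed pixel displacement toward the object


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def match_hands_to_objects(
    hands: List[HandDetection], objects: List[Detection]
) -> List[Optional[int]]:
    """For each hand, return the index of its matched object, or None."""
    matches: List[Optional[int]] = []
    for hand in hands:
        if not hand.in_contact or not objects:
            matches.append(None)
            continue
        hx, hy = center(hand.box)
        # Point the hand's offset vector is aiming at.
        tx, ty = hx + hand.offset[0], hy + hand.offset[1]

        def squared_distance(i: int) -> float:
            cx, cy = center(objects[i].box)
            return (cx - tx) ** 2 + (cy - ty) ** 2

        matches.append(min(range(len(objects)), key=squared_distance))
    return matches


if __name__ == "__main__":
    hands = [
        HandDetection(box=(100, 100, 140, 140), score=0.9,
                      in_contact=True, offset=(60.0, 0.0)),
        HandDetection(box=(300, 300, 340, 340), score=0.8,
                      in_contact=False, offset=(0.0, 0.0)),
    ]
    objects = [
        Detection(box=(160, 100, 200, 140), score=0.7),
        Detection(box=(0, 0, 40, 40), score=0.6),
    ]
    print(match_hands_to_objects(hands, objects))
```

Nearest-center assignment is a deliberately simple choice; a stricter variant could also reject matches whose distance exceeds a threshold or whose object score is low.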

Maintenance & Community

The project is associated with CVPR 2020 (Oral) and lists Dandan Shan, Jiaqi Geng, Michelle Shu, and David F. Fouhey as contributors. No specific community channels or active maintenance indicators are present in the README.

Licensing & Compatibility

The repository does not explicitly state a license. However, it is based on faster-rcnn.pytorch (using branch pytorch-1.0), which may have its own licensing terms. Commercial use compatibility is not specified.

Limitations & Caveats

The project notes occasional false positives with no people present, difficulties with left/right hand classification in egocentric data, and challenges in parsing full states with multiple people. The egocentric models are noted to perform significantly better for egocentric data.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 90 days
