hand_object_detector by ddshan

Hand object detector for understanding human hands in contact

Created 5 years ago

282 stars

Project Summary

This repository provides a PyTorch implementation of a hand object detector based on Faster R-CNN. It targets the problem of understanding human hands in contact with objects in internet-scale data, and is aimed at researchers and developers in human-computer interaction, robotics, or activity recognition who need to detect hands and classify their interactions with objects.

How It Works

The detector builds on Faster R-CNN, a well-established object detection architecture, and uses a ResNet-101 backbone for feature extraction. Beyond standard detection, it is adapted to detect hands and objects, classify each hand's contact state (e.g., self-contact, contact with another person, or contact with an object), and predict hand side (left/right).

Quick Start & Requirements

Installation: Clone the repository, create a conda environment (handobj_new), and install PyTorch 1.12.1 with CUDA 11.3. Compile CUDA dependencies by running python setup.py build develop in the lib directory. Install Python dependencies via pip install -r requirements.txt.
Prerequisites: Python 3.8, PyTorch 1.12.1, CUDA 11.3.
Data: Datasets must be in Pascal VOC format. Prepared data and pre-trained ResNet-101 models are available for download.
Links: Project and dataset webpage: https://github.com/ddshan/hand_object_detector
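
The Pascal VOC format mentioned above stores one XML annotation file per image, with one object element per labeled box. The following is a minimal sketch of reading such a file with the Python standard library only. The class names (hand, targetobject), the file name, and the coordinates are illustrative assumptions, not values taken from the repository, and any dataset-specific fields (such as hand state) would need extra handling that is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the standard Pascal VOC layout.
# Class names and coordinates are made up for illustration.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>hand</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>180</xmax><ymax>210</ymax></bndbox>
  </object>
  <object>
    <name>targetobject</name>
    <bndbox><xmin>170</xmin><ymin>150</ymin><xmax>300</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return (filename, [(class_name, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # VOC coordinates are pixel values; some tools write them as floats.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, coords))
    return filename, boxes


if __name__ == "__main__":
    filename, boxes = parse_voc_annotation(SAMPLE_XML)
    print(filename)
    for name, coords in boxes:
        print(name, coords)
```

In practice the XML text would be read from the annotation files shipped with the downloaded data rather than from an inline string.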

Highlighted Details

Reports high average precision (AP) on the project's own datasets (e.g., 90.4 for Hand Object detection on handobj_100K+ego).
Offers models trained on different dataset combinations (handobj_100K, handobj_100K+ego) for varying performance characteristics.
Detection outputs include bounding boxes, confidence scores, hand states, and offset vectors.
Includes a matching.py script for post-processing detection results.
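
To make the output fields listed above more concrete, the sketch below represents one hand detection and pairs it with the nearest detected object using its offset vector. Everything about the layout is an assumption made for illustration: the field names, the state label "object_contact", the offset being a pixel displacement (dx, dy) from the hand center, and the nearest-center matching rule. The repository's matching.py may encode and match detections differently.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class HandDetection:
    box: Box
    score: float
    state: str                   # hypothetical label, e.g. "object_contact"
    offset: Tuple[float, float]  # assumed (dx, dy) from the hand center


@dataclass
class ObjectDetection:
    box: Box
    score: float


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def match_hand_to_object(
    hand: HandDetection, objects: List[ObjectDetection]
) -> Optional[int]:
    """Return the index of the object whose center is closest to the point
    reached by adding the offset to the hand center, or None if the hand is
    not in object contact or no objects were detected."""
    if hand.state != "object_contact" or not objects:
        return None
    hx, hy = center(hand.box)
    tx, ty = hx + hand.offset[0], hy + hand.offset[1]

    def distance(i: int) -> float:
        ox, oy = center(objects[i].box)
        return math.hypot(ox - tx, oy - ty)

    return min(range(len(objects)), key=distance)


if __name__ == "__main__":
    hand = HandDetection((100, 100, 140, 140), 0.98, "object_contact", (60.0, 0.0))
    objects = [
        ObjectDetection((0, 0, 40, 40), 0.80),
        ObjectDetection((160, 100, 200, 140), 0.91),
    ]
    print(match_hand_to_object(hand, objects))
```

A nearest-center rule is only one simple way to use an offset vector; it is shown here because it keeps the example short, not because it is known to be the project's method.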

Maintenance & Community

The project is associated with CVPR 2020 (Oral) and lists Dandan Shan, Jiaqi Geng, Michelle Shu, and David F. Fouhey as contributors. No specific community channels or active maintenance indicators are present in the README.

Licensing & Compatibility

The repository does not explicitly state a license. However, it is based on faster-rcnn.pytorch (using branch pytorch-1.0), which may have its own licensing terms. Commercial use compatibility is not specified.

Limitations & Caveats

The project notes occasional false positives when no people are present, difficulty classifying left versus right hands in egocentric data, and difficulty parsing full states in scenes with multiple people. The egocentric models are noted to perform significantly better on egocentric data.
