rcnn by rbgirshick

Object detection system using CNNs and region proposals

Created 12 years ago

2,413 stars

Top 18.8% on SourcePulse

9 Experts Love This Project

Andrej Karpathy

Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n

Deshraj Yadav

Cofounder of Mem0

Vaibhav Nivargi

Cofounder of Moveworks

Jianwei Yang

Research Scientist at Meta Superintelligence Lab

and 5 more!

Project Summary

This repository provides the original R-CNN (Regions with Convolutional Neural Network Features) codebase, a pioneering object detection system. It targets researchers and practitioners interested in the historical development of deep learning for computer vision, offering a foundational understanding of region-based object detection.

How It Works

R-CNN combines a bottom-up region proposal method (Selective Search) with features extracted by a deep Convolutional Neural Network (CNN). Proposed regions are warped and fed into a CNN to generate rich feature vectors. These features are then used to train category-specific SVMs for classification and a linear regression model for bounding box refinement, achieving state-of-the-art performance at the time of its release.
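
As a rough illustration of the last two stages, the sketch below scores one feature vector with per-category linear classifiers and then refines a proposal box with predicted offsets. It is a minimal Python sketch, not the repository's MATLAB code. The function names, the toy weights, and the (x1, y1, x2, y2) box layout are assumptions made for illustration. The offset form (a center shift scaled by box size plus a log-space change in width and height) follows the parameterization described in the R-CNN paper.

```python
import math


def score_categories(features, classifiers):
    """Score one feature vector with a linear classifier per category.

    `classifiers` maps a category name to (weights, bias). Both are
    hypothetical stand-ins for trained SVM parameters.
    """
    return {
        name: sum(w * f for w, f in zip(weights, features)) + bias
        for name, (weights, bias) in classifiers.items()
    }


def refine_box(box, deltas):
    """Apply predicted offsets (dx, dy, dw, dh) to a proposal box.

    Boxes are (x1, y1, x2, y2) in continuous coordinates (an assumption;
    pixel-index conventions would add an offset of one to widths).
    """
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = deltas
    width, height = x2 - x1, y2 - y1
    center_x, center_y = x1 + 0.5 * width, y1 + 0.5 * height

    # Shift the center in proportion to the proposal's size.
    new_center_x = center_x + dx * width
    new_center_y = center_y + dy * height
    # Scale width and height in log space, so a delta of 0 means "no change".
    new_width = width * math.exp(dw)
    new_height = height * math.exp(dh)

    return (
        new_center_x - 0.5 * new_width,
        new_center_y - 0.5 * new_height,
        new_center_x + 0.5 * new_width,
        new_center_y + 0.5 * new_height,
    )


if __name__ == "__main__":
    toy_classifiers = {"cat": ([0.5, 0.25], -0.5), "dog": ([-1.0, 0.0], 0.0)}
    print(score_categories([1.0, 2.0], toy_classifiers))
    print(refine_box((0.0, 0.0, 10.0, 10.0), (0.1, 0.0, 0.0, 0.0)))
```

Keeping classification and box refinement as separate functions mirrors the description above, where the same CNN features feed two independently trained models.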

Quick Start & Requirements

Install: Requires MATLAB and Caffe v0.999 specifically.
Prerequisites: MATLAB (tested with 2012b on 64-bit Linux), Caffe v0.999 with the MATLAB wrapper compiled (make matcaffe), ImageNet auxiliary data (./get_ilsvrc_aux.sh), and the R-CNN source code. The CUDA and MKL libraries must be on LD_LIBRARY_PATH.
Setup: Clone the repo, create a symlink to Caffe, download the pre-computed models (1.5 GB), and optionally download pre-computed Selective Search boxes.
Demo: After setup, run rcnn_demo within MATLAB.
Docs: https://github.com/rbgirshick/rcnn
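
Because a missing library path is an easy setup mistake, a small preflight check can help before starting MATLAB. The Python sketch below is only a heuristic and is not part of the repository. The function name is hypothetical, and matching the substrings "cuda" and "mkl" against path entries is an assumption about how those libraries are installed.

```python
import os


def missing_library_hints(ld_library_path, required=("cuda", "mkl")):
    """Return the hints in `required` that match no entry of the path string.

    Entries are split on ":" (the Linux separator) and matched by
    case-insensitive substring, which is a heuristic, not a real linker check.
    """
    entries = [entry.lower() for entry in ld_library_path.split(":") if entry]
    return [
        hint for hint in required
        if not any(hint in entry for entry in entries)
    ]


if __name__ == "__main__":
    missing = missing_library_hints(os.environ.get("LD_LIBRARY_PATH", ""))
    if missing:
        print("LD_LIBRARY_PATH has no entry mentioning:", ", ".join(missing))
    else:
        print("LD_LIBRARY_PATH mentions both CUDA and MKL")
```

An empty result only means that some path entry mentions each library. It does not confirm that the shared objects exist or that their versions match what Caffe v0.999 expects.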

Highlighted Details

Achieved 53.3% mAP on PASCAL VOC 2012, a relative improvement of more than 30% over the previous best result.
Supports training custom detectors on datasets with bounding box annotations.
Includes code for fine-tuning CNNs for detection using Caffe.
Models available for PASCAL VOC 2007, 2010, 2012, and ILSVRC13.
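
The relative-improvement figure quoted for VOC 2012 can be turned around to estimate the earlier baseline it implies. The short Python sketch below does that arithmetic. The function name is hypothetical, and the result is a derived estimate that treats the gain as exactly 30%, not a number reported on this page. A larger relative gain would imply a lower baseline.

```python
def implied_baseline(new_score, relative_gain):
    """Invert new = old * (1 + relative_gain) to recover the old score."""
    return new_score / (1.0 + relative_gain)


if __name__ == "__main__":
    # 53.3% mAP and a 30% relative gain are the figures quoted in the list above.
    baseline = implied_baseline(53.3, 0.30)
    print(f"implied earlier mAP: about {baseline:.1f}%")
```

Note that "relative" here means the gain is measured as a fraction of the earlier score, not as a difference in percentage points.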

Maintenance & Community

The README states that this codebase is "no longer maintained and exists as a historical artifact." It serves as a supplement to the accompanying academic papers.

Licensing & Compatibility

Released under the Simplified BSD License, which permits commercial use.

Limitations & Caveats

The code is outdated and incompatible with current Caffe versions. It depends on MATLAB and on a specific old Caffe release (v0.999), which makes setup complex and potentially fragile. The README notes that training requires significant disk space (200 GB for the feature cache) and considerable time (8-9 hours per chunk on a powerful GPU/CPU setup).

Health Check

Last Commit

8 years ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days