bilinear-cnn by HaoMood

PyTorch implementation of bilinear CNN for fine-grained image recognition

Created 8 years ago

396 stars

Top 72.9% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of Bilinear Convolutional Neural Networks (B-CNN) for fine-grained image recognition. It addresses the need for richer image representations by capturing pairwise descriptor interactions, outperforming part-based models without requiring explicit part annotations. The target audience includes researchers and practitioners in computer vision focused on fine-grained classification tasks.

How It Works

B-CNN employs bilinear pooling to compute the sum of outer products of deep image descriptors. This approach captures all pairwise descriptor interactions in a translation-invariant manner, yielding richer representations than linear models. The implementation allows for fine-tuning either just the fully connected layer or all layers of the network.
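
The pooling step described above can be illustrated with a minimal sketch. It uses NumPy only to keep the example light; it is not the repository's code. The function name `bilinear_pool`, the averaging over locations, and the signed square root plus L2 normalization are assumptions (the last two are commonly paired with bilinear pooling but are not stated in this summary).

```python
import numpy as np


def bilinear_pool(features, eps=1e-10):
    """Pool a (C, H, W) feature map into a C*C bilinear descriptor."""
    c, h, w = features.shape
    # Each column of x is the C-dimensional descriptor at one spatial location.
    x = features.reshape(c, h * w)
    # Sum of outer products over all locations, averaged (assumption): (C, C).
    gram = x @ x.T / (h * w)
    z = gram.reshape(-1)
    # Assumed post-processing: signed square root, then L2 normalization.
    z = np.sign(z) * np.sqrt(np.abs(z) + eps)
    return z / (np.linalg.norm(z) + eps)


rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))  # hypothetical 8-channel, 4x4 feature map
desc = bilinear_pool(fmap)

# Shuffling the spatial locations leaves the descriptor unchanged,
# because the sum over locations ignores where each descriptor came from.
perm = rng.permutation(16)
shuffled = fmap.reshape(8, 16)[:, perm].reshape(8, 4, 4)
desc_shuffled = bilinear_pool(shuffled)
```

The descriptor has C*C entries (64 here), one per pair of channels, which is where the "pairwise interactions" come from. The shuffle check shows the orderless property that underlies the translation invariance mentioned above.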

Quick Start & Requirements

Install: Requires Python 3.6 with NumPy and PyTorch (the code targets PyTorch 0.3.0).
Usage:
- Fine-tune FC layer only: ./src/bilinear_cnn_fc.py --base_lr 1.0 --batch_size 64 --epochs 55 --weight_decay 1e-8
- Fine-tune all layers: ./src/bilinear_cnn_all.py --base_lr 1e-2 --batch_size 64 --epochs 25 --weight_decay 1e-5 --model "model.pth"
Hardware: Supports multi-GPU training; select the visible devices by setting the environment variable before the command (e.g., CUDA_VISIBLE_DEVICES=0,1,2,3).
Links: Original Paper
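
The two usage modes above differ in which parameters receive gradients. The following is a rough sketch of that idea in current PyTorch, not the repository's scripts: `TinyBilinearNet`, `trainable_parameters`, the single-convolution backbone, and the choice of plain SGD are all hypothetical. Only the learning-rate and weight-decay values are taken from the commands above.

```python
import torch
import torch.nn as nn


class TinyBilinearNet(nn.Module):
    """Hypothetical stand-in: a small conv backbone plus a bilinear FC head."""

    def __init__(self, channels=8, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1), nn.ReLU())
        self.fc = nn.Linear(channels * channels, num_classes)

    def forward(self, x):
        f = self.features(x)
        n, c, h, w = f.shape
        f = f.reshape(n, c, h * w)
        # Bilinear pooling: averaged outer products of location descriptors.
        z = torch.bmm(f, f.transpose(1, 2)) / (h * w)
        z = z.reshape(n, c * c)
        z = torch.sign(z) * torch.sqrt(torch.abs(z) + 1e-10)
        return self.fc(nn.functional.normalize(z))


def trainable_parameters(model, fc_only):
    """Freeze or unfreeze the backbone and return the parameters to optimize."""
    for p in model.features.parameters():
        p.requires_grad = not fc_only
    return [p for p in model.parameters() if p.requires_grad]


model = TinyBilinearNet()

# Stage 1: only the FC layer is updated (values from the first command).
fc_params = trainable_parameters(model, fc_only=True)
opt_fc = torch.optim.SGD(fc_params, lr=1.0, weight_decay=1e-8)

# Stage 2: every layer is updated (values from the second command).
all_params = trainable_parameters(model, fc_only=False)
opt_all = torch.optim.SGD(all_params, lr=1e-2, weight_decay=1e-5)

logits = model(torch.randn(2, 3, 6, 6))
```

The second command takes a `--model "model.pth"` argument, which suggests the all-layers stage starts from a checkpoint saved by the FC-only stage; the sketch simply reuses the same in-memory model.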

Highlighted Details

Achieves 76.77% test set accuracy when fine-tuning only the FC layer.
Achieves 84.17% test set accuracy when fine-tuning all layers.
Captures pairwise descriptor interactions for improved representation richness.

Maintenance & Community

The project is relatively old, written for PyTorch 0.3.0.
A faster alternative for newer PyTorch versions is available at HaoMood/blinear-cnn-faster.
Author: Hao Zhang (zhangh0214@gmail.com).

Licensing & Compatibility

License: CC BY-SA 3.0.
This is the Creative Commons Attribution-ShareAlike license. It permits commercial use, but requires attribution and requires derivative works to be shared under the same or a compatible license, which can complicate integration into proprietary products.

Limitations & Caveats

The repository targets an outdated PyTorch version (0.3.0), so running it on modern PyTorch releases will likely require migration work.
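
As an illustration of what such migration usually involves, the sketch below shows an evaluation step written for current PyTorch, with comments noting the 0.3-era idioms it replaces. Whether the repository uses these exact idioms is an assumption; the model, data, and variable names here are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder model and data, used only to make the sketch runnable.
model = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(6, 4)
labels = torch.tensor([0, 1, 2, 0, 1, 2])

# 0.3-era code wrapped tensors as Variable(inputs, volatile=True) for inference.
# Since 0.4, tensors track gradients themselves and torch.no_grad() disables it.
model.eval()
with torch.no_grad():
    scores = model(inputs)
    loss = criterion(scores, labels)

# 0.3-era code read scalars as loss.data[0]; since 0.4, use .item().
loss_value = loss.item()
num_correct = (scores.argmax(dim=1) == labels).sum().item()
accuracy = 100.0 * num_correct / labels.size(0)
```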
