PyTorch implementation of bilinear CNN for fine-grained image recognition
This repository provides a PyTorch implementation of Bilinear Convolutional Neural Networks (B-CNN) for fine-grained image recognition. It addresses the need for richer image representations by capturing pairwise descriptor interactions, outperforming part-based models without requiring explicit part annotations. The target audience includes researchers and practitioners in computer vision focused on fine-grained classification tasks.
How It Works
B-CNN employs bilinear pooling to compute the sum of outer products of deep image descriptors. This approach captures all pairwise descriptor interactions in a translation-invariant manner, yielding richer representations than linear models. The implementation allows for fine-tuning either just the fully connected layer or all layers of the network.
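The pooling step described above can be sketched in a few lines of modern PyTorch. This is a minimal illustration, not the repository's code: the function name `bilinear_pool` and the tensor sizes are assumptions chosen for the example, and the normalization steps often applied after pooling are left out because this section does not describe them.

```python
import torch


def bilinear_pool(features: torch.Tensor) -> torch.Tensor:
    """Sum of outer products of per-location descriptors.

    features: (N, C, H, W) feature maps, e.g. the output of a conv backbone.
    returns:  (N, C * C) pooled representation.
    """
    n, c, h, w = features.shape
    # Treat each spatial location as one C-dimensional descriptor.
    x = features.reshape(n, c, h * w)
    # x @ x^T sums the outer products x_i x_i^T over all H*W locations.
    pooled = torch.bmm(x, x.transpose(1, 2))  # (N, C, C)
    return pooled.reshape(n, c * c)


torch.manual_seed(0)
feats = torch.randn(2, 8, 4, 4)  # hypothetical batch of feature maps
out = bilinear_pool(feats)

# Shuffle the 16 spatial locations and pool again.
perm = torch.randperm(16)
shuffled = feats.reshape(2, 8, 16)[:, :, perm].reshape(2, 8, 4, 4)
out_shuffled = bilinear_pool(shuffled)
```

Because the outer products are summed over locations, `out` and `out_shuffled` agree up to floating-point error: the pooled descriptor does not depend on where in the image a descriptor occurs. The output has `C * C` entries per image, one per pair of feature channels.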
Quick Start & Requirements
Fine-tune the fully connected layer only:
./src/bilinear_cnn_fc.py --base_lr 1.0 --batch_size 64 --epochs 55 --weight_decay 1e-8
Fine-tune all layers:
./src/bilinear_cnn_all.py --base_lr 1e-2 --batch_size 64 --epochs 25 --weight_decay 1e-5 --model "model.pth"
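To choose which GPUs the scripts can use, set the CUDA_VISIBLE_DEVICES environment variable, for example CUDA_VISIBLE_DEVICES=0,1,2,3.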
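The two fine-tuning modes (fully connected layer only, then all layers) can be illustrated with a small sketch in modern PyTorch. Everything here is an assumption made for illustration: `TinyBilinearNet` is a hypothetical stand-in model, the use of SGD is a guess, and only the learning-rate and weight-decay values are taken from the commands above. It is not the repository's training code.

```python
import torch
import torch.nn as nn


class TinyBilinearNet(nn.Module):
    """Hypothetical stand-in: small conv backbone, bilinear pooling, linear classifier."""

    def __init__(self, channels: int = 8, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.fc = nn.Linear(channels * channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.features(x)
        n, c, h, w = f.shape
        f = f.reshape(n, c, h * w)
        pooled = torch.bmm(f, f.transpose(1, 2)).reshape(n, c * c)
        return self.fc(pooled)


model = TinyBilinearNet()

# Mode 1: freeze the backbone so that only the classifier is trained.
for p in model.features.parameters():
    p.requires_grad = False
fc_optimizer = torch.optim.SGD(model.fc.parameters(), lr=1.0, weight_decay=1e-8)
stage1_trainable = [name for name, p in model.named_parameters() if p.requires_grad]

# Mode 2: unfreeze everything and train with a smaller learning rate.
for p in model.parameters():
    p.requires_grad = True
all_optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-5)
stage2_trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

In the first mode only `fc.weight` and `fc.bias` receive gradients, which keeps the pretrained backbone fixed while the new classifier settles. The second mode updates every parameter, which is why a much smaller learning rate is a common choice there.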
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The repository targets an outdated PyTorch version (0.3.0), so running it on modern PyTorch releases will likely require migration work.
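As a rough guide to what such a migration tends to involve, the sketch below shows a few idioms that changed between PyTorch 0.3 and 0.4 and later. Whether the repository uses each of the older patterns is not stated here, so treat this as a checklist of likely candidates rather than a description of its code. The model and data are hypothetical placeholders.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)            # hypothetical placeholder model
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(2, 4)
targets = torch.tensor([0, 2])

# 0.3-era code wrapped tensors as Variable(inputs) before the forward pass.
# Since 0.4, tensors track gradients directly, so no wrapper is needed.
loss = criterion(model(inputs), targets)
loss.backward()

# 0.3-era code read a scalar loss with loss.data[0].
# Since 0.4, use .item() to get a Python float.
loss_value = loss.item()

# 0.3-era code disabled gradients with Variable(x, volatile=True).
# Since 0.4, wrap inference in torch.no_grad().
model.eval()
with torch.no_grad():
    preds = model(inputs).argmax(dim=1)

# 0.3-era code called .cuda() on models and tensors.
# Since 0.4, torch.device and .to(device) also work on CPU-only machines.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```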
Status: Inactive (6 years ago)