vicreg by facebookresearch

PyTorch code for the VICReg self-supervised learning research paper

Created 3 years ago

553 stars

Top 57.8% on SourcePulse

Project Summary

This repository provides the official PyTorch implementation for VICReg, a self-supervised learning method that uses variance, invariance, and covariance regularization to train powerful visual representations. It is targeted at researchers and engineers working on computer vision and deep learning who need robust feature extractors without labeled data.

How It Works

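VICReg addresses the collapse problem in self-supervised learning, where an encoder maps every input to the same embedding, by combining three terms in its loss: variance, invariance, and covariance. The variance term is a hinge loss that keeps the standard deviation of each embedding dimension across the batch above a threshold, which rules out the trivial constant solution. The invariance term is the mean squared distance between the embeddings of two augmented views of the same image, pulling them together. The covariance term pushes the off-diagonal entries of the embedding covariance matrix toward zero, decorrelating the dimensions so they do not encode redundant information. Because collapse is prevented by regularizing each branch directly rather than by contrasting against negative samples, the method avoids the large batch sizes or memory banks that contrastive methods commonly rely on.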
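
As a rough illustration of how the three terms combine, here is a minimal PyTorch sketch; it is not the repository's code. The function name vicreg_loss_sketch, the coefficient defaults (25, 25, 1), the hinge threshold of 1, and the small eps are assumptions chosen for illustration. x and y stand for the batch embeddings of the two augmented views, each of shape (batch, dim).

```python
import torch
import torch.nn.functional as F


def vicreg_loss_sketch(x, y, sim_coeff=25.0, std_coeff=25.0, cov_coeff=1.0, eps=1e-4):
    """Illustrative VICReg-style loss; all defaults are assumptions."""
    n, d = x.shape

    # Invariance: mean squared distance between the two views' embeddings.
    inv = F.mse_loss(x, y)

    # Center each embedding dimension over the batch.
    x = x - x.mean(dim=0)
    y = y - y.mean(dim=0)

    # Variance: hinge that penalizes any dimension whose std falls below 1.
    std_x = torch.sqrt(x.var(dim=0) + eps)
    std_y = torch.sqrt(y.var(dim=0) + eps)
    var = (F.relu(1.0 - std_x).mean() + F.relu(1.0 - std_y).mean()) / 2

    # Covariance: squared off-diagonal entries of each branch's covariance matrix.
    cov_x = (x.T @ x) / (n - 1)
    cov_y = (y.T @ y) / (n - 1)
    off_diag = ~torch.eye(d, dtype=torch.bool, device=x.device)
    cov = cov_x[off_diag].pow(2).sum() / d + cov_y[off_diag].pow(2).sum() / d

    loss = sim_coeff * inv + std_coeff * var + cov_coeff * cov
    return loss, (inv, var, cov)


# Fully collapsed embeddings: every row is identical in both views.
collapsed = torch.ones(8, 4)
loss, (inv, var, cov) = vicreg_loss_sketch(collapsed, collapsed)
print(float(loss), float(inv), float(var), float(cov))
```

On fully collapsed embeddings the invariance and covariance terms vanish and only the variance hinge is active, which is the signal that pushes the encoder away from the constant solution.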

Quick Start & Requirements

Install: pip install torch torchvision, plus submitit for multi-node training
Prerequisites: PyTorch 1.8.1+, torchvision 0.9.1+, ImageNet dataset.
Pretrained Models: Available via PyTorch Hub for ResNet-50, ResNet-50 (x2), and ResNet-200 (x2) backbones.
```
import torch
resnet50 = torch.hub.load('facebookresearch/vicreg:main', 'resnet50')
```
Training: Single-node example that launches 8 processes on the ImageNet dataset: python -m torch.distributed.launch --nproc_per_node=8 main_vicreg.py --data-dir /path/to/imagenet/ --exp-dir /path/to/experiment/ --arch resnet50 --epochs 100 --batch-size 512 --base-lr 0.3
Evaluation: Scripts for linear and semi-supervised evaluation are provided.

Highlighted Details

Offers pretrained ResNet backbones achieving up to 77.3% top-1 accuracy on ImageNet linear evaluation.
Supports both single-node and multi-node (SLURM) training configurations.
Includes evaluation scripts for linear classification and semi-supervised fine-tuning.
Built upon the Barlow Twins repository.

Maintenance & Community

This project is from Meta AI. No specific community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The code was developed against PyTorch 1.8.1 and torchvision 0.9.1; it is expected to work with newer versions, but they are not the versions it was developed on. Training requires a large dataset such as ImageNet and significant computational resources (multiple GPUs).
