Research paper code for robust fine-tuning of zero-shot models
This repository provides WiSE-FT, a method for robustly fine-tuning large zero-shot models like CLIP. It addresses the common issue where standard fine-tuning degrades out-of-distribution (OOD) accuracy. WiSE-FT is designed for researchers and practitioners working with vision-language models who need to adapt them to specific tasks while maintaining or improving robustness across different data distributions.
How It Works
WiSE-FT achieves robustness by ensembling the weights of the original zero-shot model and a standard fine-tuned model. This interpolation is performed using a mixing coefficient alpha, effectively creating a convex combination of the two weight sets. This approach preserves the generalization capabilities of the zero-shot model while incorporating task-specific knowledge from fine-tuning, leading to improved OOD performance without additional computational cost during inference or fine-tuning.
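The weight-space ensemble itself is a single operation over matching parameter tensors. The following is a minimal, illustrative sketch (not the repository's own code) of interpolating two PyTorch state dicts; the stand-in nn.Linear models are placeholders for the actual zero-shot and fine-tuned CLIP checkpoints, which must share the same architecture.

```python
import torch
import torch.nn as nn

def interpolate_weights(zeroshot_sd, finetuned_sd, alpha):
    """Convex combination of two state dicts:
    theta = (1 - alpha) * theta_zeroshot + alpha * theta_finetuned."""
    assert set(zeroshot_sd) == set(finetuned_sd), "checkpoints must share the same keys"
    return {k: (1 - alpha) * zeroshot_sd[k] + alpha * finetuned_sd[k] for k in zeroshot_sd}

# Toy demonstration with stand-in models; in practice the state dicts would come
# from the zero-shot and fine-tuned CLIP checkpoints.
zeroshot, finetuned = nn.Linear(4, 2), nn.Linear(4, 2)
wise_sd = interpolate_weights(zeroshot.state_dict(), finetuned.state_dict(), alpha=0.5)

model = nn.Linear(4, 2)
model.load_state_dict(wise_sd)  # alpha=0 recovers the zero-shot model, alpha=1 the fine-tuned one
```

Sweeping alpha between 0 and 1 traces the trade-off between in-distribution and out-of-distribution accuracy; the paper reports that intermediate values often improve both.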
Quick Start & Requirements
Create the conda environment with conda env create -f environment.yml and activate it with conda activate wiseft. Add the repository directory to PYTHONPATH: export PYTHONPATH="$PYTHONPATH:$PWD". See datasets.md for instructions on downloading the evaluation datasets.

Interpolate existing zero-shot and fine-tuned checkpoints and evaluate:
python src/wise_ft.py --load=models/zeroshot.pt,models/finetuned.pt --eval-datasets=... --alpha 0 0.1 ...

Fine-tune the zero-shot model, then interpolate and evaluate:
python src/wise_ft.py --train-dataset=ImageNet --model=ViT-B/32 --eval-datasets=... --alpha 0 0.1 ...

Plot the results:
python src/scatter_plot.py --results-db=results.jsonl --save plots
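Beyond the scatter plot, the results database is a JSON-lines file that can be inspected directly. The sketch below is a hypothetical post-processing example, assuming each line holds an alpha value alongside per-dataset metrics; the actual key names depend on the repository's output format and should be checked in results.jsonl.

```python
import json

# Hypothetical layout, e.g. {"alpha": 0.5, "ImageNet:top1": ...};
# the real field names depend on what wise_ft.py writes.
with open("results.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

for record in sorted(records, key=lambda r: r.get("alpha", 0.0)):
    alpha = record.get("alpha")
    metrics = {k: v for k, v in record.items() if k != "alpha"}
    print(f"alpha={alpha}: {metrics}")
```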
Highlighted Details
Maintenance & Community
The project accompanies the paper "Robust fine-tuning of zero-shot models" (Wortsman et al., CVPR 2022), whose authors are affiliated with institutions including the University of Washington, OpenAI, and the Allen Institute for AI. No community channels (Discord/Slack) or roadmap are mentioned in the README.
Licensing & Compatibility
The repository does not explicitly state a license. The code is provided for research purposes; commercial use would require reviewing the licenses and terms of use of the underlying models and datasets.
Limitations & Caveats
The README does not list known limitations or bugs. WiSE-FT requires the zero-shot and fine-tuned checkpoints to share the same architecture, and its effectiveness depends on the quality of the fine-tuned model. Reproducing the evaluations requires downloading several benchmark datasets, some of which are large.