Research paper code for robust fine-tuning of zero-shot models
This repository provides WiSE-FT, a method for robustly fine-tuning large zero-shot models like CLIP. It addresses the common issue where standard fine-tuning degrades out-of-distribution (OOD) accuracy. WiSE-FT is designed for researchers and practitioners working with vision-language models who need to adapt them to specific tasks while maintaining or improving robustness across different data distributions.
How It Works
WiSE-FT achieves robustness by ensembling the weights of the original zero-shot model and a standard fine-tuned model. This interpolation is performed using a mixing coefficient alpha, effectively creating a convex combination of the two weight sets. This approach preserves the generalization capabilities of the zero-shot model while incorporating task-specific knowledge from fine-tuning, leading to improved OOD performance without additional computational cost during inference or fine-tuning.
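The weight-space ensemble itself is a single operation over matching parameter tensors. The following is a minimal, illustrative sketch (not the repository's own code) of interpolating two PyTorch state dicts; the stand-in nn.Linear models are placeholders for the actual zero-shot and fine-tuned CLIP checkpoints, which must share the same architecture.

```python
import torch
import torch.nn as nn

def interpolate_weights(zeroshot_sd, finetuned_sd, alpha):
    """Convex combination of two state dicts:
    theta = (1 - alpha) * theta_zeroshot + alpha * theta_finetuned."""
    assert set(zeroshot_sd) == set(finetuned_sd), "checkpoints must share the same keys"
    return {k: (1 - alpha) * zeroshot_sd[k] + alpha * finetuned_sd[k] for k in zeroshot_sd}

# Toy demonstration with stand-in models; in practice the state dicts would come
# from the zero-shot and fine-tuned CLIP checkpoints.
zeroshot, finetuned = nn.Linear(4, 2), nn.Linear(4, 2)
wise_sd = interpolate_weights(zeroshot.state_dict(), finetuned.state_dict(), alpha=0.5)

model = nn.Linear(4, 2)
model.load_state_dict(wise_sd)  # alpha=0 recovers the zero-shot model, alpha=1 the fine-tuned one
```

Sweeping alpha between 0 and 1 traces the trade-off between in-distribution and out-of-distribution accuracy; the paper reports that intermediate values often improve both.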
Quick Start & Requirements
Create the conda environment with conda env create -f environment.yml and activate it with conda activate wiseft. Add the repository directory to PYTHONPATH: export PYTHONPATH="$PYTHONPATH:$PWD". See datasets.md for instructions on downloading the evaluation datasets.

Interpolate existing zero-shot and fine-tuned checkpoints and evaluate:
python src/wise_ft.py --load=models/zeroshot.pt,models/finetuned.pt --eval-datasets=... --alpha 0 0.1 ...

Fine-tune the zero-shot model, then interpolate and evaluate:
python src/wise_ft.py --train-dataset=ImageNet --model=ViT-B/32 --eval-datasets=... --alpha 0 0.1 ...

Plot the results:
python src/scatter_plot.py --results-db=results.jsonl --save plots
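Beyond the scatter plot, the results database is a JSON-lines file that can be inspected directly. The sketch below is a hypothetical post-processing example, assuming each line holds an alpha value alongside per-dataset metrics; the actual key names depend on the repository's output format and should be checked in results.jsonl.

```python
import json

# Hypothetical layout, e.g. {"alpha": 0.5, "ImageNet:top1": ...};
# the real field names depend on what wise_ft.py writes.
with open("results.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

for record in sorted(records, key=lambda r: r.get("alpha", 0.0)):
    alpha = record.get("alpha")
    metrics = {k: v for k, v in record.items() if k != "alpha"}
    print(f"alpha={alpha}: {metrics}")
```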
Highlighted Details
Maintenance & Community
The project accompanies the paper "Robust fine-tuning of zero-shot models" (Wortsman et al., CVPR 2022), whose authors are affiliated with institutions including the University of Washington, OpenAI, and the Allen Institute for AI. No community channels (Discord/Slack) or roadmap are mentioned in the README.
Licensing & Compatibility
The repository does not explicitly state a license. The code is provided for research purposes; commercial use would require reviewing the licenses and terms of use of the underlying models and datasets.
Limitations & Caveats
The README does not list known limitations or bugs. WiSE-FT requires the zero-shot and fine-tuned checkpoints to share the same architecture, and its effectiveness depends on the quality of the fine-tuned model. Reproducing the evaluations requires downloading several benchmark datasets, some of which are large.