PromptSRC: vision-language prompt learning
This repository provides the official implementation for PromptSRC, a self-regulating framework for adapting foundational vision-language models like CLIP to downstream tasks without sacrificing their generalizability. It targets researchers and practitioners in computer vision and natural language processing seeking to improve prompt learning efficiency and performance.
How It Works
PromptSRC addresses the common issue of prompt learning methods overfitting to downstream tasks, leading to a loss of CLIP's inherent generalization capabilities. It employs a three-pronged self-regularization approach: maximizing mutual agreement between prompted and frozen model features, using Gaussian-weighted self-ensembling of prompts over training, and incorporating textual diversity to balance visual and textual branches. This strategy aims to jointly optimize for task-specific performance and task-agnostic representations.
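A minimal sketch of those three regularizers in PyTorch is shown below. All function and variable names (mutual_agreement_loss, gaussian_weight, ensemble_prompts, diverse_text_features) and the hyperparameter defaults are illustrative assumptions, not the repository's actual API; see the paper and code for the exact formulation.

```python
# Hypothetical sketch of PromptSRC's three self-regularization terms.
import math
import torch
import torch.nn.functional as F

def mutual_agreement_loss(prompted_feats, frozen_feats,
                          prompted_logits, frozen_logits):
    """Keep prompted features and predictions close to frozen CLIP's."""
    feat_loss = F.l1_loss(prompted_feats, frozen_feats)      # feature-level agreement
    logit_loss = F.kl_div(prompted_logits.log_softmax(dim=-1),
                          frozen_logits.softmax(dim=-1),
                          reduction="batchmean")             # logit-level agreement
    return feat_loss + logit_loss

def gaussian_weight(epoch, total_epochs, mean_frac=0.6, std=3.0):
    """Gaussian weighting of prompt checkpoints along the training run
    (mean_frac and std are made-up defaults for illustration)."""
    mu = mean_frac * total_epochs
    return math.exp(-((epoch - mu) ** 2) / (2 * std ** 2))

def ensemble_prompts(per_epoch_prompts):
    """Self-ensembling: Gaussian-weighted average of per-epoch prompt vectors,
    so the final prompt aggregates the whole training trajectory."""
    n = len(per_epoch_prompts)
    weights = torch.tensor([gaussian_weight(e, n) for e in range(n)])
    weights = weights / weights.sum()
    stacked = torch.stack(per_epoch_prompts)                 # (n, n_ctx, dim)
    return (weights.view(-1, *([1] * (stacked.dim() - 1))) * stacked).sum(0)

def diverse_text_features(clip_model, classnames, templates, tokenizer):
    """Textual diversity: average frozen text features over several prompt
    templates (e.g. "a photo of a {}.") per class."""
    feats = []
    for template in templates:
        tokens = tokenizer([template.format(c) for c in classnames])
        feats.append(F.normalize(clip_model.encode_text(tokens), dim=-1))
    return torch.stack(feats).mean(0)
```

Under these assumptions, the agreement loss would be added to the task loss during training, while the Gaussian-weighted ensemble would yield the final prompts once training completes.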
Quick Start & Requirements
See INSTALL.md for environment setup, DATASETS.md for dataset preparation, EVAL.md for reproducing results with pre-trained models, and TRAIN.md for training from scratch.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats