PromptSRC by muzairkhattak

Vision-language prompt learning research paper

created 2 years ago
270 stars

Top 95.9% on sourcepulse

Project Summary

This repository provides the official implementation for PromptSRC, a self-regulating framework for adapting foundational vision-language models like CLIP to downstream tasks without sacrificing their generalizability. It targets researchers and practitioners in computer vision and natural language processing seeking to improve prompt learning efficiency and performance.

How It Works

Prompt learning methods tend to overfit to the downstream task and, as a result, lose CLIP's inherent generalization capabilities. PromptSRC addresses this with a three-pronged self-regularization approach: maximizing mutual agreement between the features of the prompted model and those of the frozen model, self-ensembling the prompts learned over the course of training with Gaussian weights, and incorporating textual diversity to balance the visual and textual branches. Together, these components aim to jointly optimize for task-specific performance and task-agnostic representations.
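To make two of these components concrete, here is a minimal pure-Python sketch of Gaussian-weighted prompt ensembling and a feature-agreement penalty. It is an illustration under stated assumptions, not the repository's implementation: the function names, the `mean` and `std` parameters, the use of one prompt snapshot per epoch, and the choice of a mean absolute difference as the agreement measure are all hypothetical.

```python
import math


def gaussian_weights(num_epochs, mean, std):
    """Return one normalized Gaussian weight per epoch (epochs numbered from 1).

    `mean` and `std` are hypothetical hyperparameters chosen for illustration.
    """
    raw = [
        math.exp(-((epoch - mean) ** 2) / (2 * std ** 2))
        for epoch in range(1, num_epochs + 1)
    ]
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_prompts(prompt_history, weights):
    """Weighted average of prompt vectors saved during training.

    `prompt_history` is a list of equal-length lists of floats,
    one snapshot per epoch (an assumption of this sketch).
    """
    dim = len(prompt_history[0])
    return [
        sum(w * prompt[i] for w, prompt in zip(weights, prompt_history))
        for i in range(dim)
    ]


def agreement_penalty(prompted_features, frozen_features):
    """Mean absolute difference between prompted and frozen features.

    Minimizing this value pulls the prompted features toward the frozen
    ones; the specific distance is an assumption of this sketch.
    """
    diffs = [abs(p - f) for p, f in zip(prompted_features, frozen_features)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    weights = gaussian_weights(num_epochs=5, mean=3, std=1)
    history = [[0.1 * e, -0.1 * e] for e in range(1, 6)]
    print(ensemble_prompts(history, weights))
    print(agreement_penalty([1.0, 3.0], [0.0, 1.0]))
```

In a real training loop, the penalty would be added to the task loss, and the ensembled prompt, rather than the final-epoch prompt, would be used at inference time.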

Quick Start & Requirements

  • Installation: Follow instructions in INSTALL.md.
  • Data Preparation: Follow instructions in DATASETS.md.
  • Evaluation: Refer to EVAL.md for reproducing results with pre-trained models.
  • Training: Refer to TRAIN.md for training from scratch.
  • Dependencies: Not explicitly detailed in the README, but likely requires PyTorch and CLIP.

Highlighted Details

  • Achieves state-of-the-art performance on base-to-novel generalization across 11 image recognition datasets.
  • Outperforms existing methods like CoOp, CoCoOp, ProDA, and MaPLe on various benchmarks.
  • Supports MaPLe, CoOp, and CoCoOp architectures.
  • Provides pre-trained models and evaluation codes.

Maintenance & Community

Licensing & Compatibility

  • The README does not explicitly state a license. However, the project is based on other repositories which may have their own licenses. Users should verify licensing for commercial use.

Limitations & Caveats

  • Detailed installation and dependency information is deferred to separate markdown files, requiring additional steps to assess system requirements.
  • The specific license is not mentioned, which could be a concern for commercial adoption.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

  • 10 stars in the last 90 days
