Curated list of prompt/adapter learning methods for VLMs
This repository is a curated list of prompt and adapter learning methods for Vision-Language Models (VLMs) like CLIP. It serves researchers and practitioners by cataloging papers, providing comparative benchmarks, and highlighting open-source implementations, aiming to facilitate efficient adaptation of VLMs for various downstream tasks.
How It Works
The list categorizes methods into prompt learning, test-time prompt learning, and adapter learning, with specific sections for video understanding and continual learning. It includes a comparative table of prompt learning methods, showcasing performance metrics across multiple datasets and indicating the availability of code. The repository emphasizes papers with open-source code, making it a practical resource for empirical evaluation and implementation.
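For readers new to these families, below is a minimal, self-contained sketch (not taken from any specific listed paper) of the two core ideas: prompt learning optimizes a handful of context vectors fed to a frozen text encoder, while adapter learning trains a small residual bottleneck on top of frozen features. The encoder, dimensions, and hyperparameters are illustrative placeholders, not a reference implementation.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; real methods plug into CLIP's actual towers.
embed_dim, feat_dim, n_ctx = 512, 512, 4

class PromptLearner(nn.Module):
    """Prompt learning: optimize a few context vectors, keep the VLM frozen."""
    def __init__(self, frozen_text_encoder, class_token_embeds):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, embed_dim) * 0.02)  # learnable context
        self.register_buffer("cls", class_token_embeds)                # frozen class-name embeddings (C, L, D)
        self.encoder = frozen_text_encoder                             # frozen text tower (stand-in)

    def forward(self):
        # Prepend the shared learnable context to every class-name embedding.
        ctx = self.ctx.unsqueeze(0).expand(self.cls.size(0), -1, -1)   # (C, n_ctx, D)
        prompts = torch.cat([ctx, self.cls], dim=1)                    # (C, n_ctx + L, D)
        return self.encoder(prompts)                                   # assumed to return (C, feat_dim) class weights

class Adapter(nn.Module):
    """Adapter learning: a small residual bottleneck over frozen features."""
    def __init__(self, dim=feat_dim, reduction=4, ratio=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.ReLU(inplace=True),
        )
        self.ratio = ratio  # blend between adapted and original frozen features

    def forward(self, frozen_features):
        return self.ratio * self.net(frozen_features) + (1 - self.ratio) * frozen_features
```

In both cases only a tiny number of parameters is trained while the pretrained VLM stays frozen, which is what makes these methods attractive for few-shot and resource-constrained adaptation.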
Quick Start & Requirements
This is a curated list, not a runnable software package. To use a method, refer to its paper and the linked code repository; these implementations typically require a Python environment with a deep learning framework (PyTorch/TensorFlow) and a specific VLM implementation such as CLIP. Links to papers and code are provided for each entry.
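For orientation, here is a minimal zero-shot example using OpenAI's `clip` package; most listed methods start from such a frozen backbone and add learnable prompts or adapters on top. The image path and class names are placeholders.

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # frozen CLIP backbone

# "example.jpg" and the class names are placeholders; use your own data.
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
text = clip.tokenize([f"a photo of a {c}" for c in ["cat", "dog", "car"]]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)   # image-text similarity logits
    probs = logits_per_image.softmax(dim=-1)   # zero-shot class probabilities
print(probs)
```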
Maintenance & Community
The list is actively maintained by zhengli97, with an invitation for community contributions via email or GitHub issues. The focus is on including papers from top-tier conferences and journals.
Licensing & Compatibility
The repository itself is a list and does not have a specific license. Individual papers and their associated codebases will have their own licenses, which users must adhere to. Compatibility for commercial use depends entirely on the licenses of the referenced code.
Limitations & Caveats
The repository is a collection of links and information; it does not provide a unified API or framework. Users must individually retrieve, set up, and integrate each method. Some listed papers may not have publicly available code or may be in early stages (e.g., arXiv preprints).