Curated list of papers on prompting in computer vision and vision-language learning
This repository is a curated list of academic papers on prompting techniques in computer vision and vision-language learning. It serves researchers and practitioners by cataloging key works that use prompting for parameter-efficient adaptation of foundation models, enabling zero-shot and few-shot learning. The collection aims to give a comprehensive overview of this rapidly evolving field.
How It Works
The repository categorizes papers based on their prompting approach: "Vision Prompt" for adapting vision foundation models (e.g., ViT), "Vision-Language Prompt" for vision-language models (e.g., CLIP), and "Language-Interactable Prompt" for models that integrate multiple modalities via language interfaces, often for multimodal chatbots. This structured approach allows users to quickly find relevant research based on the type of model and task.
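To make these categories concrete, the following is a minimal sketch of the core idea behind prompt tuning: a small set of learnable prompt tokens is prepended to the input sequence of a frozen backbone, and only those tokens plus a lightweight head are trained. It is an illustration only, built on a toy PyTorch encoder; the class name, dimensions, and hyperparameters are placeholder assumptions and are not taken from any paper in the list.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Illustrative prompt tuning: frozen backbone + learnable prompt tokens."""

    def __init__(self, embed_dim=768, num_prompts=10, num_classes=100):
        super().__init__()
        # Stand-in for a pretrained backbone (in practice, e.g. a ViT); kept frozen.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=12, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Only the prompt tokens and the classification head receive gradients.
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens):
        # patch_tokens: (batch, seq_len, embed_dim), e.g. ViT patch embeddings.
        prompts = self.prompts.expand(patch_tokens.size(0), -1, -1)
        x = torch.cat([prompts, patch_tokens], dim=1)  # prepend prompts to the sequence
        x = self.backbone(x)
        return self.head(x[:, 0])  # read out the first prompt token

model = PromptedEncoder()
logits = model(torch.randn(4, 196, 768))  # 14x14 patches of a 224x224 image
print(logits.shape)  # torch.Size([4, 100])
```

The same mechanism carries over to vision-language models such as CLIP, where the learnable tokens replace or augment hand-written text prompts (e.g. "a photo of a {class}") rather than image patch embeddings.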
Limitations & Caveats
This repository is a static list of papers and does not provide any direct functionality or code. Users must follow the provided links to access the actual research papers and their associated codebases, which may have varying levels of maturity, documentation, and support.