Curated list of papers on prompting in computer vision and vision-language learning
This repository is a curated list of academic papers on prompting techniques in computer vision and vision-language learning. It serves researchers and practitioners by cataloging key works that use prompting for parameter-efficient adaptation of foundation models, enabling zero-shot and few-shot learning. The collection aims to give a comprehensive overview of this rapidly evolving field.
How It Works
The repository categorizes papers based on their prompting approach: "Vision Prompt" for adapting vision foundation models (e.g., ViT), "Vision-Language Prompt" for vision-language models (e.g., CLIP), and "Language-Interactable Prompt" for models that integrate multiple modalities via language interfaces, often for multimodal chatbots. This structured approach allows users to quickly find relevant research based on the type of model and task.
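To make these categories concrete, the following is a minimal sketch of the core idea behind prompt tuning: a small set of learnable prompt tokens is prepended to the input sequence of a frozen backbone, and only those tokens plus a lightweight head are trained. It is an illustration only, built on a toy PyTorch encoder; the class name, dimensions, and hyperparameters are placeholder assumptions and are not taken from any paper in the list.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Illustrative prompt tuning: frozen backbone + learnable prompt tokens."""

    def __init__(self, embed_dim=768, num_prompts=10, num_classes=100):
        super().__init__()
        # Stand-in for a pretrained backbone (in practice, e.g. a ViT); kept frozen.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=12, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Only the prompt tokens and the classification head receive gradients.
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens):
        # patch_tokens: (batch, seq_len, embed_dim), e.g. ViT patch embeddings.
        prompts = self.prompts.expand(patch_tokens.size(0), -1, -1)
        x = torch.cat([prompts, patch_tokens], dim=1)  # prepend prompts to the sequence
        x = self.backbone(x)
        return self.head(x[:, 0])  # read out the first prompt token

model = PromptedEncoder()
logits = model(torch.randn(4, 196, 768))  # 14x14 patches of a 224x224 image
print(logits.shape)  # torch.Size([4, 100])
```

The same mechanism carries over to vision-language models such as CLIP, where the learnable tokens replace or augment hand-written text prompts (e.g. "a photo of a {class}") rather than image patch embeddings.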
Limitations & Caveats
This repository is a static list of papers and does not provide any direct functionality or code. Users must follow the provided links to access the actual research papers and their associated codebases, which may have varying levels of maturity, documentation, and support.