Awesome-Composed-Image-Retrieval by haokunwen

Collection of Composed Image Retrieval (CIR) papers and datasets

created 1 year ago
251 stars

Top 99.8% on SourcePulse

Project Summary

This repository is a curated collection of research papers and datasets on Composed Image Retrieval (CIR), the task of retrieving target images from a query that combines a reference image with text describing the desired change. It targets researchers and practitioners in computer vision and information retrieval, offering an overview of the state of the art and a structured categorization of CIR techniques.

How It Works

The collection groups CIR research into sub-fields, including attribute-based, few-shot, zero-shot, semi-supervised, and conversational CIR, alongside related areas such as Composed Video Retrieval (COVR) and sketch-based CIR. Each entry links to the paper, often as an arXiv preprint or in conference proceedings, covering both foundational and recent work. The repository also includes dataset statistics that list the modalities, scale, and domain of various CIR benchmarks.

Quick Start & Requirements

This repository is a collection of research papers, so there is nothing to install or run. For implementations and their requirements, follow the links to the individual papers.

Highlighted Details

  • Comprehensive survey paper available on arXiv, analyzing over 120 papers from 2017-2024.
  • Categorization covers 10 distinct areas of Composed Image Retrieval.
  • Detailed statistics for over 20 datasets, including modalities, scale, and domain.
  • Includes emerging datasets and related research areas like COVR and sketch-based CIR.

Maintenance & Community

The repository is maintained by haokunwen. Further community or development information is not explicitly provided in the README.

Licensing & Compatibility

The licensing of the individual papers linked within this collection varies by their original publication venue. This repository itself does not appear to have a specific license.

Limitations & Caveats

This repository is a curated list of papers and does not provide code implementations or direct access to datasets. Users must follow the links to individual papers to obtain those resources and to understand each project's requirements. The survey paper is dated "Feb. 2025", which suggests it may be a preprint that has not yet been formally published.

Health Check
Last commit: 1 week ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History
20 stars in the last 30 days

Explore Similar Projects

Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Travis Fischer (Founder of Agentic), and 2 more.

fromage by kohjingyu

0%
482
Multimodal model for grounding language models to images
created 2 years ago
updated 1 year ago