Multimodal search engine pipeline and benchmark for large multimodal models (LMMs)
MMSearch provides a comprehensive pipeline and benchmark for evaluating Large Multi-modal Models (LMMs) as multimodal search engines. It addresses the gap in standardized evaluation for LMMs in search tasks, offering a framework for researchers and developers to assess and compare model performance in this domain.
How It Works
MMSearch introduces a pipeline, MMSearch-Engine, to enable LMMs to function as multimodal search engines. The benchmark comprises 300 manually curated instances across 14 subfields, designed to avoid overlap with existing LMM training data. Evaluation employs a step-wise strategy, assessing models on requery, rerank, and summarization tasks, culminating in an end-to-end search process. This approach allows for granular understanding of LMM capabilities in different stages of a search query.
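The step-wise strategy above can be sketched as a simple aggregation: each benchmark instance gets a score for every sub-task, and per-task averages plus an overall score are reported. The field names and the unweighted mean are illustrative assumptions, not MMSearch's actual scoring spec:

```python
# Hypothetical sketch: aggregate per-instance step-wise scores
# (requery, rerank, summarization) and end-to-end accuracy.
# Field names and equal weighting are assumptions for illustration.

def final_score(results):
    """results: list of dicts mapping sub-task name -> score in [0, 1]."""
    steps = ("requery", "rerank", "summarization", "end2end")
    # Average each sub-task across all benchmark instances.
    avgs = {s: sum(r[s] for r in results) / len(results) for s in steps}
    # Unweighted mean over the four sub-tasks (assumption).
    overall = sum(avgs.values()) / len(steps)
    return avgs, overall

# Two made-up instances to show the shape of the input.
demo = [
    {"requery": 1.0, "rerank": 0.5, "summarization": 0.8, "end2end": 0.6},
    {"requery": 0.0, "rerank": 1.0, "summarization": 0.6, "end2end": 0.4},
]
avgs, overall = final_score(demo)
```

Reporting per-stage averages alongside the overall score is what allows the granular comparison of where a given LMM fails in the search loop.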
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt, then run playwright install.
Run inference through the .infer function, or use the provided scripts for each stage: scripts/run_end2end.sh, scripts/run_rerank.sh, and scripts/run_summarization.sh. Compute the final score with scripts/run_get_final_score.sh.
A command-line demo is available via demo/run_demo_cli.sh.
The benchmark data can be loaded with load_dataset("CaraJ/MMSearch") from Hugging Face.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats