Collection of papers on pre-trained models for information retrieval
This repository is a curated list of academic papers on pre-trained models for Information Retrieval (IR). It serves researchers and practitioners by organizing key publications across sub-topics such as sparse, dense, and hybrid retrieval, re-ranking, and the integration of Large Language Models (LLMs) with IR, providing a structured overview of a rapidly evolving research landscape.
How It Works
The list categorizes papers by their contribution to IR: retrieval techniques (e.g., sparse representation learning, hard negative sampling for dense retrieval), architectural innovations (e.g., multi-vector representations, long-document processing), and emerging trends such as LLM-augmented retrieval. Each entry links to the paper and, where available, to an associated code repository, so users can quickly access and explore the relevant research.
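To make the sparse/dense distinction above concrete, the following is a minimal, hypothetical sketch of the two scoring styles; it is not taken from any listed paper, and the embedding size, vocabularies, and weights are placeholder assumptions.

```python
# Illustrative sketch only: the core scoring step behind the "dense" and
# "sparse" retrieval categories referenced above.
import numpy as np

# Dense retrieval: query and documents share a learned embedding space;
# relevance is an inner product between vectors (values here are random stand-ins).
query_emb = np.random.rand(768)           # e.g. output of a BERT-style encoder
doc_embs = np.random.rand(1000, 768)      # pre-computed document embeddings
dense_scores = doc_embs @ query_emb       # one relevance score per document

# Sparse retrieval: query and documents are (learned) term-weight vectors over a
# vocabulary; relevance is an inner product over the terms they share.
query_terms = {"neural": 1.2, "ranking": 0.8}           # term -> learned weight
doc_terms = [
    {"neural": 0.9, "network": 1.1},                    # one dict per document
    {"ranking": 1.5, "loss": 0.4},
]
sparse_scores = [
    sum(w * d.get(t, 0.0) for t, w in query_terms.items())
    for d in doc_terms
]

print("best dense match: doc", int(np.argmax(dense_scores)))
print("sparse scores:", sparse_scores)
```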
Quick Start & Requirements
As a curated reading list, the repository has nothing to install or run. Access the linked papers and code repositories directly.
Maintenance & Community
The repository is maintained by ict-bigdatalab and welcomes contributions via Pull Requests. Feedback and suggestions are encouraged.
Licensing & Compatibility
The repository itself is a list of links and does not impose a license on the content it references. Individual papers and code repositories will have their own respective licenses.
Limitations & Caveats
As a curated list, its completeness depends on community contributions, and the rapid pace of research means new papers may not be included immediately. Consult the license of each individual paper and code repository for usage restrictions.