Semantic-Retrieval-Models by caiyinqiong

List of semantic retrieval papers

created 4 years ago
326 stars

Top 84.8% on sourcepulse

Project Summary

This repository is a curated list of academic papers on semantic retrieval models for first-stage information retrieval. It is a reference for researchers and practitioners who want to follow how the techniques evolved, from classical term-based methods to neural retrieval models, and it gives a structured overview of key papers, methodologies, and datasets in the field.

How It Works

The repository categorizes papers into distinct eras and approaches: Classical Term-based Retrieval, Early Semantic Retrieval methods (like query expansion and topic models), and modern Neural Methods (Sparse, Dense, and Hybrid Retrieval). It highlights seminal works and recent advancements, offering a historical and technical progression of semantic retrieval research. The structure allows users to trace the development of techniques and identify foundational papers for specific areas.
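To make the Sparse / Dense / Hybrid split concrete, here is a minimal sketch of one simple way a hybrid retriever can combine the two signals: min-max normalize a sparse score list and a dense dot-product score list, then interpolate them. This is an illustration only, not a method taken from any specific paper in the list. The function names, the weight `alpha`, and the toy vectors are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def min_max(xs):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def hybrid_scores(sparse, query_vec, doc_vecs, alpha=0.5):
    """Interpolate normalized sparse scores with normalized dense scores.

    alpha is an assumed mixing weight: 1.0 means sparse only, 0.0 dense only.
    """
    dense = [dot(query_vec, d) for d in doc_vecs]
    s, d = min_max(sparse), min_max(dense)
    return [alpha * a + (1 - alpha) * b for a, b in zip(s, d)]


# Toy inputs (made up): term-matching scores and 2-d embeddings for 3 documents.
sparse_scores = [2.0, 0.0, 1.0]
doc_vecs = [[0.3, 0.9], [0.8, 0.2], [0.7, 0.7]]
query_vec = [1.0, 0.0]

fused = hybrid_scores(sparse_scores, query_vec, doc_vecs, alpha=0.6)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

In this toy case the third document ranks first because it scores reasonably on both signals, while the other two documents each do well on only one.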

Quick Start & Requirements

This repository is a curated list of papers and does not require installation or execution. It serves as a reference guide.

Highlighted Details

  • Comprehensive coverage from foundational term-based models (VSM, TF-IDF, BM25) to advanced neural methods (DPR, SPLADE, ColBERT).
  • Detailed sections on specific techniques like Query Expansion, Document Expansion, Term Dependency Models, Topic Models, and Translation Models.
  • Includes a dedicated section on Neural Methods, breaking down Sparse, Dense, and Hybrid retrieval approaches with numerous paper references.
  • Lists relevant datasets (MS MARCO, TREC CAR, TREC COVID) and indexing methods (KD-tree, LSH, PQ, HNSW) crucial for implementing retrieval systems.
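
As a reference point for the term-based baselines named in the first bullet, the following is a minimal sketch of BM25 scoring over pre-tokenized documents. It assumes whitespace tokenization, the common defaults k1=1.5 and b=0.75, and a smoothed IDF variant that stays non-negative. The function name and toy corpus are hypothetical and not taken from the repository.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for illustration).
docs = [
    "neural dense retrieval with bert".split(),
    "classical term based retrieval with bm25".split(),
    "topic models for documents".split(),
]
scores = bm25_scores("bm25 retrieval".split(), docs)
```

The second document matches both query terms and scores highest. The third shares no terms with the query and scores zero, which is the vocabulary-mismatch problem that semantic retrieval methods try to address.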

Maintenance & Community

The repository is maintained by caiyinqiong, with an invitation for feedback and contributions via issues or direct contact.

Licensing & Compatibility

The repository itself is a list of links to academic papers and does not have a specific software license. The licensing of the linked papers is determined by their respective publishers or venues.

Limitations & Caveats

This is a curated list of papers and does not provide code implementations or direct access to the research papers themselves. Users will need to find and access the papers through academic databases or other sources.

Health Check
Last commit: 2 years ago
Responsiveness: 1 week
Pull Requests (30d): 0
Issues (30d): 0
Star History: 3 stars in the last 90 days
