IR benchmark for evaluating NLP retrieval models
BEIR (A Heterogeneous Benchmark for Information Retrieval) provides a standardized framework and a collection of 15+ diverse datasets for evaluating NLP-based retrieval models. It is designed for researchers and practitioners in information retrieval and NLP who need to assess model effectiveness across varied domains and tasks, particularly in zero-shot settings.
How It Works
BEIR offers a unified API for loading datasets, integrating retrieval models (lexical, dense, sparse, and re-ranking), and evaluating their performance with standard metrics such as NDCG@k, MAP@k, and Recall@k. It supports various embedding models, including Sentence-BERT and Hugging Face Transformers, with flexible pooling strategies and pre/post-processing options. The framework facilitates reproducible research by providing reference implementations and a common evaluation pipeline, as sketched below.
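To illustrate the unified interface, here is a minimal sketch of the duck-typed dense-model contract described in BEIR's documentation: any object exposing encode_queries and encode_corpus can be wrapped by DenseRetrievalExactSearch. The ToyEncoder below (a hashed bag-of-words embedder) is purely hypothetical, included only to show the expected signatures, not a real retrieval model.

```python
from typing import Dict, List

import numpy as np


class ToyEncoder:
    """Hypothetical encoder; any object with these two methods can be
    wrapped by beir.retrieval.search.dense.DenseRetrievalExactSearch."""

    dim = 256

    def _embed(self, texts: List[str]) -> np.ndarray:
        # Toy stand-in for a neural encoder: bag-of-hashed-tokens vectors.
        vecs = np.zeros((len(texts), self.dim), dtype=np.float32)
        for i, text in enumerate(texts):
            for tok in text.lower().split():
                vecs[i, hash(tok) % self.dim] += 1.0
        return vecs

    def encode_queries(self, queries: List[str], batch_size: int = 16, **kwargs) -> np.ndarray:
        # One embedding per query, shape (len(queries), dim).
        return self._embed(queries)

    def encode_corpus(self, corpus: List[Dict[str, str]], batch_size: int = 16, **kwargs) -> np.ndarray:
        # BEIR passes corpus entries as dicts with "title" and "text" fields.
        return self._embed([(doc.get("title", "") + " " + doc["text"]).strip() for doc in corpus])
```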
Quick Start & Requirements
pip install beir
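The end-to-end sketch below mirrors the quickstart in BEIR's README; the SciFact download URL, the msmarco-distilbert-base-tas-b checkpoint, and the default k_values cutoffs are taken from that example and may change between releases.

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download and unzip one of the 15+ benchmark datasets (SciFact is small).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")

# Load the corpus, queries, and relevance judgments (qrels) for the test split.
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Wrap a Sentence-BERT checkpoint in exact (brute-force) dense search.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=16)
retriever = EvaluateRetrieval(model, score_function="dot")

# Retrieve, then score with the standard metrics at the default cutoffs.
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg)  # per-cutoff dictionary, keyed like "NDCG@10"
```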
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats