bergen by naver

Benchmarking RAG systems for question-answering

Created 1 year ago
253 stars

Top 99.4% on SourcePulse

Project Summary

BERGEN is a benchmarking library for Retrieval-Augmented Generation (RAG) systems, primarily targeting question-answering tasks. It addresses the challenge of inconsistent RAG evaluation by providing a reproducible framework. Researchers and engineers benefit from standardized comparisons, component analysis, and strong baseline results across numerous datasets and models.

How It Works

The library facilitates RAG system benchmarking through a flexible, YAML-configurable pipeline comprising retrievers, rerankers, and large language models (LLMs). It supports easy integration of new datasets and models, promoting reproducibility. This modular design allows users to systematically evaluate the impact of individual RAG components and compare different configurations efficiently.
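The modular idea can be illustrated with a minimal sketch. Every name below (RagPipeline, overlap_retriever, shortest_first_reranker, echo_generator) is hypothetical and chosen for illustration; none of it is BERGEN's actual API, and the toy components stand in for real retrievers, rerankers, and LLMs. The point is only that swapping one component while holding the others fixed isolates its effect on the final answer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical type aliases for this sketch; not BERGEN's real interfaces.
Retriever = Callable[[str, List[str], int], List[str]]
Reranker = Callable[[str, List[str]], List[str]]
Generator = Callable[[str, List[str]], str]


def overlap_score(query: str, doc: str) -> int:
    """Count distinct lowercase query words that also appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def overlap_retriever(query: str, corpus: List[str], top_k: int) -> List[str]:
    """Toy retriever: keep the top_k documents by word overlap with the query."""
    ranked = sorted(corpus, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:top_k]


def shortest_first_reranker(query: str, docs: List[str]) -> List[str]:
    """Toy reranker: prefer shorter documents (a deliberately naive stand-in)."""
    return sorted(docs, key=len)


def echo_generator(query: str, docs: List[str]) -> str:
    """Toy generator: return the top document instead of calling an LLM."""
    return docs[0] if docs else ""


@dataclass
class RagPipeline:
    retriever: Retriever
    generator: Generator
    reranker: Optional[Reranker] = None  # the reranking stage is optional here
    top_k: int = 2

    def answer(self, query: str, corpus: List[str]) -> str:
        docs = self.retriever(query, corpus, self.top_k)
        if self.reranker is not None:
            docs = self.reranker(query, docs)
        return self.generator(query, docs)


corpus = [
    "Paris is the capital of France and its largest city",
    "Berlin is the capital of Germany",
    "The Seine flows through Paris",
]
query = "capital of France"

# Same retriever and generator; only the reranker differs between the two runs.
without_reranker = RagPipeline(overlap_retriever, echo_generator)
with_reranker = RagPipeline(overlap_retriever, echo_generator, shortest_first_reranker)

print(without_reranker.answer(query, corpus))
print(with_reranker.answer(query, corpus))
```

In this toy setup the naive reranker promotes the wrong document, so the two runs return different answers. That is the kind of component-level effect a benchmark with interchangeable stages is meant to surface.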

Quick Start & Requirements

Installation follows a dedicated guide. A typical experiment is launched with python3 bergen.py retriever=<name> reranker=<name> generator=<name> dataset=<name>. The usage examples suggest that specific Python versions, and libraries such as vLLM for generation, may be required. The README links to the initial paper (arXiv:2407.01102) and to a multilingual RAG paper (arXiv:2407.01463).
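When many configurations are compared, it can help to assemble the command line programmatically. The sketch below builds the key=value invocation shown above. The helper build_bergen_command is hypothetical, the component names are placeholders rather than real configuration names, and allowing keys to be left out is an assumption of this sketch, not documented behavior.

```python
import shlex
from typing import Dict, List


def build_bergen_command(overrides: Dict[str, str]) -> List[str]:
    """Assemble the argv list for one experiment from key=value overrides."""
    # Only the four keys shown in the example command are accepted in this sketch.
    allowed = ("retriever", "reranker", "generator", "dataset")
    unknown = set(overrides) - set(allowed)
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    # Fixed key order, so the same configuration always yields the same command.
    args = [f"{key}={overrides[key]}" for key in allowed if key in overrides]
    return ["python3", "bergen.py", *args]


# Placeholder names; replace them with configuration names that exist in your checkout.
command = build_bergen_command(
    {"dataset": "my_dataset", "retriever": "my_retriever", "generator": "my_generator"}
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run from the repository root once the placeholders are replaced with real names.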

Highlighted Details

  • Extensive support for 20+ retrievers, 4 rerankers, and 20+ LLMs.
  • Comprehensive evaluation metrics including Match, EM, and LLMEval, with pairwise LLM-based comparison options.
  • Features multilingual RAG experiment capabilities.
  • Provides established RAG baselines on key datasets (ASQA, NQ, TriviaQA, POPQA, HotPotQA) using models like Llama-2, Mistral, and Solar.
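To make the difference between the Match and EM metrics listed above concrete, here is a minimal sketch using definitions that are common in open-domain question answering: EM requires the normalized prediction to equal a reference answer, while Match only requires a reference to appear inside the prediction. The normalization (lowercasing, removing punctuation and English articles) is an assumption borrowed from common practice; BERGEN's own implementation may differ in detail.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: List[str]) -> float:
    """1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return float(any(pred == normalize(ref) for ref in references))


def match(prediction: str, references: List[str]) -> float:
    """1.0 if any normalized reference occurs as a substring of the prediction."""
    pred = normalize(prediction)
    return float(any(normalize(ref) in pred for ref in references))


prediction = "The capital of France is Paris."
references = ["Paris", "Paris, France"]

em_score = exact_match(prediction, references)
match_score = match(prediction, references)
print(em_score, match_score)
```

A verbose but correct answer scores 0 on EM and 1 on Match here. This gap is one reason an LLM-based judgment such as LLMEval is useful alongside the string-based metrics.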

Maintenance & Community

The provided README does not detail specific contributors, sponsorships, partnerships, or community channels (e.g., Discord, Slack), nor does it link to a public roadmap.

Licensing & Compatibility

BERGEN is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license. This license strictly prohibits commercial use and requires any derivative works to be shared under the same terms, potentially limiting adoption in commercial products.

Limitations & Caveats

The primary limitation is the CC BY-NC-SA 4.0 license, which restricts commercial application and mandates the same license for derivative works. Detailed installation steps and documentation links are referenced but not directly embedded in the README snippet.

Health Check
Last Commit: 3 months ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History
7 stars in the last 30 days
