Evaluating retrieval algorithms for RAG with LlamaIndex
This repository evaluates retrieval techniques and algorithm performance for the Retrieve stage in RAG systems, using LlamaIndex as the primary framework. It targets researchers and engineers working on improving RAG retrieval accuracy and efficiency by comparing various retrieval methods, embedding models, and re-ranking strategies.
How It Works
The project systematically evaluates retrieval strategies including BM25, several embedding models (OpenAI, BGE, BGE-Finetune), and ensemble methods, and further assesses the impact of re-ranking with models such as Cohere Rerank, BGE-Base Rerank, and BGE-Large Rerank. It also explores query-side techniques, Query Rewrite and HyDE, to enhance retrieval performance. Each configuration is evaluated on a custom dataset (data/doc_qa_test.json), with hit_rate, mrr, and cost_time reported; a sketch of how such configurations could be assembled follows below.
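For orientation, here is a minimal sketch of three of the compared variants in LlamaIndex 0.9.x. The corpus path, model names, and top-k values are illustrative assumptions rather than the project's exact settings, and building the vector index assumes an embedding backend (e.g., an OpenAI API key) is configured.

```python
# Hedged sketch: three retrieval variants in llama-index 0.9.x.
# "data/docs", model names, and top_k values are assumptions for illustration.
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.indices.query.query_transform import HyDEQueryTransform
from llama_index.postprocessor import SentenceTransformerRerank
from llama_index.query_engine import TransformQueryEngine
from llama_index.retrievers import BM25Retriever

documents = SimpleDirectoryReader("data/docs").load_data()
index = VectorStoreIndex.from_documents(documents)  # uses the configured embedding model

# Sparse baseline: BM25 over the same nodes the vector index stores
# (requires the rank_bm25 package).
bm25_retriever = BM25Retriever.from_defaults(
    docstore=index.docstore, similarity_top_k=3
)

# Dense retrieval + re-ranking: fetch a wide candidate set, then keep the
# top_n nodes after re-scoring with a BGE cross-encoder.
reranker = SentenceTransformerRerank(model="BAAI/bge-reranker-base", top_n=3)
rerank_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[reranker]
)

# HyDE: generate a hypothetical answer first, then retrieve with its embedding.
hyde_engine = TransformQueryEngine(
    index.as_query_engine(similarity_top_k=3),
    HyDEQueryTransform(include_original=True),
)
```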
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt; the benchmarks were run against LlamaIndex 0.9.21, so pinning that version is likely needed to reproduce them. The evaluation dataset ships with the repository at data/doc_qa_test.json, and a rough evaluation loop is sketched below.
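A run then looks roughly like the loop below, using LlamaIndex's RetrieverEvaluator and a guessed schema for data/doc_qa_test.json (query text mapped to the node IDs that count as hits); the file's actual structure may differ.

```python
# Hedged sketch of an evaluation loop reporting hit_rate, mrr, and cost_time.
# The JSON schema ({query: [expected_node_id, ...]}) is an assumption.
import json
import time

from llama_index.evaluation import RetrieverEvaluator

retriever = index.as_retriever(similarity_top_k=3)  # `index` as built above
evaluator = RetrieverEvaluator.from_metric_names(
    ["hit_rate", "mrr"], retriever=retriever
)

with open("data/doc_qa_test.json") as f:
    qa_pairs = json.load(f)

start = time.time()
results = [
    evaluator.evaluate(query, expected_ids=ids)
    for query, ids in qa_pairs.items()
]
cost_time = time.time() - start

hit_rate = sum(r.metric_vals_dict["hit_rate"] for r in results) / len(results)
mrr = sum(r.metric_vals_dict["mrr"] for r in results) / len(results)
print(f"hit_rate={hit_rate:.4f}  mrr={mrr:.4f}  cost_time={cost_time:.2f}s")
```

Swapping in BM25, a different embedding model, or a re-ranked retriever at the `retriever` line yields the per-configuration rows the README's tables report.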
Highlighted Details
Maintenance & Community
The project appears to be a personal research effort; the README names no maintainers, community channels, or plans for ongoing development.
Licensing & Compatibility
The README does not specify a license.
Limitations & Caveats
The project focuses on a single dataset and a specific LlamaIndex version (0.9.21), which may limit how far the results generalize. The extensive benchmark tables, while informative, are not accompanied by scripts that reproduce them directly. The cost_time figures for re-ranked ensembles are exceptionally high, which suggests either a measurement artifact or unusually expensive experimental conditions.