End-to-end code for RAG retrieval model training, inference, and distillation
This project provides an end-to-end solution for training, inference, and distillation of Retrieval-Augmented Generation (RAG) retrieval models, covering embedding, ColBERT, and reranker components. It targets researchers and developers working on RAG systems, offering unified code, support for a range of open-source models, and a focus on efficient fine-tuning and on distilling large models into smaller ones.
How It Works
The framework supports fine-tuning of diverse RAG retrieval models: embedding models (BERT-based and LLM-based), late-interaction models (ColBERT), and reranker models (BERT-based and LLM-based). It leverages algorithms such as MRL (Matryoshka Representation Learning) loss for dimensionality reduction in embedding models, and supports multi-GPU training strategies via DeepSpeed and FSDP. For inference, a lightweight Python library, `rag-retrieval`, offers a unified interface to various reranker models, including specific logic for handling long documents.
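Since cross-encoder rerankers accept only a bounded input length, a common way to handle long documents is to split them into overlapping chunks, score each chunk against the query, and keep the maximum chunk score (MaxP-style aggregation). The exact strategy the library uses is not detailed here, so the sketch below is an illustration of the general technique, with a toy word-overlap scorer standing in for a real model:

```python
from typing import Callable, List

def chunk_text(text: str, max_len: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows."""
    step = max_len - overlap
    return [text[i:i + max_len] for i in range(0, max(len(text) - overlap, 1), step)]

def score_long_document(query: str, document: str,
                        score_fn: Callable[[str, str], float],
                        max_len: int = 200, overlap: int = 50) -> float:
    """Score a long document as the maximum of its per-chunk scores."""
    chunks = chunk_text(document, max_len, overlap)
    return max(score_fn(query, chunk) for chunk in chunks)

# Toy scorer: token-overlap ratio stands in for real cross-encoder inference.
def toy_score(query: str, passage: str) -> float:
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

doc = "irrelevant filler " * 40 + "retrieval augmented generation improves answers"
print(score_long_document("retrieval augmented generation", doc, toy_score))
```

Taking the max (rather than the mean) rewards a single highly relevant passage inside an otherwise off-topic document, which usually matches retrieval intent.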
Quick Start & Requirements
For training: `conda create -n rag-retrieval python=3.8`, activate the environment, then `pip install -r requirements.txt`. For inference: `pip install rag-retrieval`. In both cases, manually installing the PyTorch build matching your CUDA version is recommended.
Highlighted Details
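The "unified interface" idea can be illustrated with a small, library-independent sketch. The class and method names below are illustrative only, not the actual `rag-retrieval` API; a real implementation would wrap a cross-encoder model where the toy word-overlap scorer appears:

```python
from typing import List, Tuple

class ToyReranker:
    """Illustrative reranker exposing a single rerank() entry point."""

    def compute_scores(self, query: str, passages: List[str]) -> List[float]:
        # Token-overlap ratio stands in for cross-encoder model inference.
        q = set(query.lower().split())
        return [len(q & set(p.lower().split())) / max(len(q), 1) for p in passages]

    def rerank(self, query: str, passages: List[str]) -> List[Tuple[str, float]]:
        # Return (passage, score) pairs sorted from most to least relevant.
        scores = self.compute_scores(query, passages)
        return sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)

reranker = ToyReranker()
ranked = reranker.rerank(
    "how to fine-tune a reranker",
    ["Guide to fine-tune a reranker model", "Unrelated cooking recipe"],
)
print(ranked[0][0])  # most relevant passage first
```

The benefit of such an interface is that BERT-based and LLM-based rerankers become interchangeable behind the same `rerank(query, passages)` call.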
The fine-tuned `rag-retrieval-reranker` model shows strong benchmark results.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README notes that gains from fine-tuning open-source models on existing general-purpose datasets may be limited, and suggests that domain-specific (vertical) datasets yield larger improvements.