s3 by pat-jj

RL framework for training efficient search agents in RAG

Created 6 months ago

778 stars

Top 45.0% on SourcePulse

Project Summary

s3 is a framework for training efficient search agents for Retrieval-Augmented Generation (RAG). It targets researchers and practitioners who want to improve RAG performance by optimizing only the retrieval component, and it enables effective search with significantly less training data than prior methods. The generator LLM is left unaltered: strong QA performance comes solely from training the search agent.

How It Works

s3 uses Reinforcement Learning (RL) to train a language model to act as a more effective search agent. The search strategy is optimized directly, so the agent learns how to retrieve relevant documents efficiently. Isolating the search problem in this way allows targeted improvements and reduces the data requirements typically associated with training large models. The framework is modular and compatible with any black-box LLM.
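As an illustration of training only the searcher while the generator stays fixed, here is a minimal sketch of a baseline-relative reward. It assumes the reward is the frozen generator's score with the agent's documents minus its score with naive RAG documents. This summary does not state s3's actual reward, so treat that formula as an assumption. The names baseline_relative_reward, toy_generate, and contains_answer are hypothetical stand-ins, not part of the s3 codebase.

```python
from typing import Callable, List


def baseline_relative_reward(
    question: str,
    gold_answer: str,
    agent_docs: List[str],
    naive_docs: List[str],
    generate: Callable[[str, List[str]], str],
    score: Callable[[str, str], float],
) -> float:
    """Score the frozen generator twice and return the difference.

    Assumption: reward = score with the agent's documents
    minus score with naive-RAG documents. The generator is only
    called, never updated, so it can be any black-box LLM.
    """
    with_agent = score(generate(question, agent_docs), gold_answer)
    with_naive = score(generate(question, naive_docs), gold_answer)
    return with_agent - with_naive


def toy_generate(question: str, docs: List[str]) -> str:
    # Hypothetical stand-in for a black-box generator: it just echoes its context.
    return " ".join(docs)


def contains_answer(prediction: str, gold_answer: str) -> float:
    # Hypothetical scoring rule: 1.0 if the gold answer appears in the prediction.
    return 1.0 if gold_answer.lower() in prediction.lower() else 0.0


if __name__ == "__main__":
    reward = baseline_relative_reward(
        question="Who wrote Dune?",
        gold_answer="Frank Herbert",
        agent_docs=["Dune was written by Frank Herbert."],
        naive_docs=["Dune is a 2021 film."],
        generate=toy_generate,
        score=contains_answer,
    )
    print(reward)
```

Under this assumption, the naive RAG scores depend only on the question and the fixed retriever, which would explain why they can be computed once and cached before training.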

Quick Start & Requirements

Installation: Requires Python 3.9 for the searcher/generator environment and Python 3.10 for the retriever environment. Key dependencies include torch (v2.4.0 with CUDA 12.1), vllm (v0.6.3), ray, flash-attn, pyserini, wandb, IPython, matplotlib, huggingface_hub, faiss-gpu (v1.8.0), uvicorn, and fastapi.
Preparation: Involves downloading an index and corpus, precomputing a naive RAG cache, and deploying both the retriever and generator.
Resources: Precomputing the naive RAG cache is noted as a time-consuming step. The CUDA 12.1 and faiss-gpu dependencies imply a CUDA-capable GPU, but no specific GPU model or memory size is stated.
Links: Installation, Preparation, Run Training, Run Search/Retrieval, Run Evaluation.
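The version pins above can be checked before launching anything expensive. The following is a minimal sketch using only the Python standard library. The PINNED dictionary copies the three versions listed under Installation, while find_mismatches and installed_version are hypothetical helper names. This summary does not say which package belongs to which of the two environments, so trim the dictionary for each environment as needed.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions copied from the Installation line; adjust per environment.
PINNED: Dict[str, str] = {
    "torch": "2.4.0",
    "vllm": "0.6.3",
    "faiss-gpu": "1.8.0",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Map each package that is missing or has the wrong version to what was found."""
    problems: Dict[str, str] = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        # A local build tag such as "2.4.0+cu121" still counts as a match.
        if found is None or found.split("+")[0] != wanted:
            problems[package] = found or "not installed"
    return problems


if __name__ == "__main__":
    for package, found in find_mismatches(PINNED).items():
        print(f"{package}: expected {PINNED[package]}, found {found}")
```

The lookup parameter exists so the check can be exercised without the heavy packages installed.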

Highlighted Details

Achieves strong performance with a fraction of the data used by prior methods.
Focuses solely on training the search component of RAG.
Modular design compatible with any black-box LLM.
Supports multiple baseline retrieval methods (RAG, DeepRetrieval, Search-R1, IRCoT, Search-o1) for comparison.

Maintenance & Community

The project acknowledges contributions from verl, RAGEN, Search-R1, DeepRetrieval, and PySerini. No specific community channels (like Discord/Slack) or roadmap links are provided in the README.

Licensing & Compatibility

The project's license is not explicitly stated in the provided README snippet. Commercial use or closed-source linking would first require the license to be clarified.

Limitations & Caveats

The README notes that precomputing the naive RAG cache "will take a while," which suggests a significant upfront time investment. Hardware requirements beyond CUDA 12.1 and GPU support for FAISS are not detailed. The project appears to be research-oriented and is tied to a recent arXiv publication.
