RL training for LLM query generation to improve information retrieval
DeepRetrieval trains Large Language Models (LLMs) for query generation using reinforcement learning, enabling them to discover optimal search queries through trial and error. This approach eliminates the need for supervised query-augmentation pairs and significantly boosts information retrieval performance across various domains, making it valuable for researchers and developers seeking to enhance search capabilities.
How It Works
The system employs an LLM that generates a reasoning step within a <think> tag, followed by the final augmented query in an <answer> tag. This structured output makes the chain-of-thought reasoning explicit. Retrieval metrics serve as rewards, guiding the LLM to iteratively refine queries for maximum retrieval effectiveness, a departure from traditional supervised methods.
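To make the reward loop concrete, the following minimal Python sketch shows how a rollout in this format might be scored: the structured output is parsed, a malformed generation is penalized, and a well-formed query is rewarded by its retrieval recall. The function names, the format penalty, and the recall-based reward here are illustrative assumptions, not the repository's actual implementation; in practice the retrieved documents would come from running the generated query against a search engine such as Pyserini.

```python
import re

def parse_rollout(text: str):
    """Split a structured generation into its <think> reasoning and <answer> query."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    if answer is None:
        return None, None  # malformed output: no usable query to search with
    return (think.group(1).strip() if think else ""), answer.group(1).strip()

def compute_reward(rollout: str, retrieved: set, relevant: set) -> float:
    """Illustrative reward: retrieval recall of the generated query; malformed outputs are penalized."""
    _, query = parse_rollout(rollout)
    if query is None:
        return -1.0  # format penalty keeps generations parseable
    return len(retrieved & relevant) / max(len(relevant), 1)

# Example rollout whose query retrieved 2 of the 4 relevant documents -> reward 0.5
rollout = ("<think>Broaden the query with clinical synonyms.</think>"
           "<answer>(myocardial infarction) OR (heart attack)</answer>")
print(compute_reward(rollout, retrieved={"d1", "d2", "d9"}, relevant={"d1", "d2", "d3", "d4"}))
```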
Quick Start & Requirements
conda create -n zero python=3.9
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip3 install vllm==0.6.3
pip3 install ray
cd code
pip install -e .
pip3 install flash-attn --no-build-isolation
pip install wandb
Additional installs for sparse/dense retrieval: pip install pyserini and pip install faiss-gpu==1.7.2. SQL support: pip install func_timeout.
Pre-processed datasets are available (DeepRetrieval/datasets), or raw data can be processed locally. For GPUs with limited VRAM, enable gradient checkpointing by setting critic.model.enable_gradient_checkpointing=True.
Maintenance & Community
The project is primarily based on verl and PySerini. The base model used in experiments is Qwen2.5-3B-Instruct. Star the repository to stay updated.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is presented as a 2025 arXiv preprint, indicating it may be experimental or pre-release. Specific VRAM requirements for training might necessitate gradient checkpointing.