DeepResearcher by GAIR-NLP

LLM research agent trained with reinforcement learning

created 4 months ago
541 stars

Top 59.7% on sourcepulse

Project Summary

DeepResearcher is a framework for end-to-end training of LLM-based research agents using reinforcement learning (RL) in real-world web environments. Agents trained this way perform complex research tasks and exhibit emergent cognitive behaviors such as planning, information cross-validation, and self-reflection. The project targets researchers and advanced users who want automated deep-dive analysis.

How It Works

The framework leverages reinforcement learning to train LLM agents through authentic web search interactions. This approach allows agents to learn complex research strategies directly from real-world data, leading to emergent capabilities such as planning, information synthesis, and self-correction, which are crucial for robust, real-world research tasks.
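
The training signal described above can be pictured with a toy episode. The sketch below is illustrative only and is not taken from the DeepResearcher codebase: the `rollout` and `f1_reward` functions, the action names, and the choice of a word-overlap F1 reward are all assumptions made for the example, and the search tool is a stub rather than a real web API.

```python
from collections import Counter


def f1_reward(prediction: str, reference: str) -> float:
    """Word-overlap F1 between prediction and reference (assumed reward, not the project's)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def rollout(policy, search, question, max_turns=4):
    """Run one episode. The policy returns ("search", query) or ("answer", text)."""
    context = [question]
    for _ in range(max_turns):
        action, payload = policy(context)
        if action == "answer":
            return payload, context
        # Any other action is treated as a search; the result becomes new context.
        context.append(search(payload))
    return "", context  # no answer produced within the turn budget


# Hypothetical stand-ins for a trained policy and a live search engine.
def toy_search(query):
    return "Paris is the capital of France."


def toy_policy(context):
    if len(context) == 1:
        return ("search", context[0])
    return ("answer", "Paris")


answer, trajectory = rollout(toy_policy, toy_search, "What is the capital of France?")
reward = f1_reward(answer, "Paris")
```

In an actual RL setup the scalar `reward` would be fed to a policy-gradient update over the whole trajectory; that step is omitted here because the summary does not describe it.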

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a Conda environment (python=3.10), install PyTorch (torch==2.4.0 with cu124), flash-attn, and the package itself (pip install -e .).
  • Prerequisites: CUDA 12.4 (for PyTorch), Python 3.10, Ray for distributed training, and API keys for search engines (Serper or Azure Bing) and potentially LLM providers (e.g., Qwen-Plus).
  • Setup: Requires configuring API keys in scrl/handler/config.yaml and scrl/handler/server_handler.py, starting a Ray head node, and running server handlers before training or evaluation.
  • Resources: Training and inference involve significant computational resources typical for large language models and RL training.
  • Links: Hugging Face Hub
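
Because the setup has several moving parts, a small preflight check can catch mismatches before starting Ray or the server handlers. The following is a minimal sketch, not part of the project: the function names are hypothetical, and only the version numbers and the two file paths come from the requirements listed above.

```python
import sys
from importlib import metadata
from pathlib import Path

REQUIRED_PYTHON = (3, 10)
REQUIRED_TORCH = "2.4.0"
# Files that the setup notes say must be edited to add API keys.
CONFIG_FILES = ("scrl/handler/config.yaml", "scrl/handler/server_handler.py")


def check_python(version_info=sys.version_info):
    """True if the interpreter is Python 3.10.x."""
    return tuple(version_info[:2]) == REQUIRED_PYTHON


def check_torch(installed=None):
    """True if torch 2.4.0 is installed; a local build tag such as +cu124 is ignored."""
    if installed is None:
        try:
            installed = metadata.version("torch")
        except metadata.PackageNotFoundError:
            return False
    return installed.split("+")[0] == REQUIRED_TORCH


def missing_config_files(repo_root="."):
    """Return the expected config files that are absent under repo_root."""
    return [p for p in CONFIG_FILES if not (Path(repo_root) / p).is_file()]


if __name__ == "__main__":
    print("python ok:", check_python())
    print("torch ok:", check_torch())
    print("missing files:", missing_config_files())
```

The check only confirms that the files exist; it does not validate the API keys inside them.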

Highlighted Details

  • Improves by up to 28.9 points over prompt-engineering baselines and by up to 7.2 points over RAG-based RL agents on open-domain research tasks.
  • Demonstrates emergent cognitive behaviors including planning, cross-validation, self-reflection, and honesty.
  • Emphasizes the necessity of end-to-end training in real-world web environments for robust research capabilities.

Maintenance & Community

The project is inspired by Deepseek-R1 and built upon veRL and Search-r1. No specific community links (Discord, Slack) or active maintenance signals are provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The citation format suggests it is a research artifact, and usage for commercial or closed-source applications would require clarification.

Limitations & Caveats

The setup process is complex, requiring multiple API keys and specific environment configurations. The project is presented as a research artifact, and its stability, long-term maintenance, and production readiness are not detailed.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 216 stars in the last 90 days
