drzero by facebookresearch

Self-evolving search agents without training data

Created 6 months ago

524 stars

Top 59.4% on SourcePulse

Project Summary

Dr. Zero is a framework for self-evolving search agents that require no pre-existing training data. It targets researchers and engineers building AI agents, and bootstraps multi-step reasoning and search capabilities through an automated, data-free curriculum. The project reports performance that matches or surpasses supervised approaches at reduced compute cost.

How It Works

The core approach involves an iterative self-evolution loop between two agents: a proposer and a solver. The proposer generates diverse, increasingly challenging yet solvable questions, while the solver, initialized from a base model (e.g., Qwen, Llama), learns to answer these questions using a search tool. This process establishes an automated curriculum. To enhance training efficiency, hop-grouped relative policy optimization (HRPO) is employed, clustering structurally similar questions to minimize sampling overhead and reduce compute requirements.
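The grouping idea behind HRPO can be illustrated with a minimal sketch. It assumes that each generated question carries a hop count and a scalar reward, and that advantages are obtained by normalizing rewards within each hop group. The function name `hop_grouped_advantages`, the `(hop_count, reward)` data layout, and the mean/standard-deviation normalization are illustrative assumptions, not the repository's implementation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hop_grouped_advantages(samples, eps=1e-6):
    """Return one advantage per sample, normalized within its hop group.

    samples: list of (hop_count, reward) pairs (hypothetical layout).
    """
    # Bucket sample indices and rewards by hop count.
    groups = defaultdict(list)
    for idx, (hops, reward) in enumerate(samples):
        groups[hops].append((idx, reward))

    advantages = [0.0] * len(samples)
    for members in groups.values():
        rewards = [r for _, r in members]
        mu = mean(rewards)
        sigma = pstdev(rewards)  # population std; 0.0 if all rewards are equal
        for idx, r in members:
            # eps avoids division by zero when a group has identical rewards.
            advantages[idx] = (r - mu) / (sigma + eps)
    return advantages


if __name__ == "__main__":
    # Two 1-hop questions and two 2-hop questions.
    demo = [(1, 1.0), (1, 0.0), (2, 0.5), (2, 0.5)]
    print(hop_grouped_advantages(demo))
```

Under these assumptions, the baseline for each question comes from other questions with the same hop count rather than from many repeated samples of the same question, which is one plausible way such grouping could reduce sampling overhead.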

Quick Start & Requirements

Installation: Requires a Python environment with PyTorch, transformers, faiss-gpu, and verl==0.5.0. Additional dependencies are listed in the repository.
Search Engine Setup: A local server with a retriever model is required. Setup involves downloading the English Wikipedia corpus and preparing a faiss index. In the commands below, $save_path is the directory passed to --save_path (./corpus in this example).
- Download Corpus: python scripts/download.py --save_path ./corpus
- Assemble Index: cat $save_path/part_* > $save_path/e5_Flat.index (concatenates the part_* files in $save_path into a single index file)
- Decompress Corpus: gzip -d $save_path/wiki-18.jsonl.gz
Initial Data Preparation:
- python process_train.py --local_dir ./data
- python process_test.py --local_dir ./data
Links: Code and scripts are provided within the repository; no separate documentation or demo links are explicitly mentioned.
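For environments without a POSIX shell, the index and corpus steps listed above can be sketched in Python using only the standard library. This is a hedged equivalent of the `cat` and `gzip -d` commands, not a script shipped with the repository; the function names `assemble_index` and `decompress_corpus` are hypothetical, and the file names are taken from the commands above.

```python
import glob
import gzip
import os
import shutil


def assemble_index(save_path, out_name="e5_Flat.index"):
    """Concatenate save_path/part_* into one file, like `cat part_* > out`."""
    # Sort lexicographically, mirroring shell glob expansion order.
    parts = sorted(glob.glob(os.path.join(save_path, "part_*")))
    if not parts:
        raise FileNotFoundError(f"no part_* files found in {save_path}")
    out_path = os.path.join(save_path, out_name)
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams in chunks
    return out_path


def decompress_corpus(save_path, gz_name="wiki-18.jsonl.gz"):
    """Decompress the corpus and delete the .gz file, like `gzip -d`."""
    gz_path = os.path.join(save_path, gz_name)
    out_path = gz_path[: -len(".gz")]
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(gz_path)  # gzip -d removes the compressed original
    return out_path


if __name__ == "__main__":
    save_path = "./corpus"  # same directory as --save_path above
    print(assemble_index(save_path))
    print(decompress_corpus(save_path))
```

Both functions stream data in chunks, so they do not need to hold the full index or corpus in memory.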

Highlighted Details

- Achieves performance comparable to or exceeding fully supervised search agents, despite operating without any training data.
- Employs HRPO to optimize training efficiency by clustering similar questions, significantly reducing compute needs.
- Demonstrates that complex reasoning and search capabilities can emerge solely through self-evolutionary processes.

Maintenance & Community

No specific details regarding active contributors, community channels (e.g., Discord, Slack), sponsorships, or a public roadmap were found in the provided README.

Licensing & Compatibility

The code is released under a non-commercial license. This restriction means the project is not suitable for integration into commercial products or services.

Limitations & Caveats

The non-commercial license restricts use to research and non-profit applications. Setting up the required local search engine, which involves downloading and indexing a large corpus, also demands significant initial effort.
