drzero by facebookresearch

Self-evolving search agents without training data

Created 1 month ago
474 stars

Project Summary

Dr. Zero is a framework for self-evolving search agents that requires no pre-existing training data. It is aimed at researchers and engineers building AI agents, and it bootstraps multi-step reasoning and search capabilities through an automated, data-free curriculum. According to the project, the resulting agents match or surpass supervised approaches while requiring less compute.

How It Works

The core approach involves an iterative self-evolution loop between two agents: a proposer and a solver. The proposer generates diverse, increasingly challenging yet solvable questions, while the solver, initialized from a base model (e.g., Qwen, Llama), learns to answer these questions using a search tool. This process establishes an automated curriculum. To enhance training efficiency, hop-grouped relative policy optimization (HRPO) is employed, clustering structurally similar questions to minimize sampling overhead and reduce compute requirements.
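The grouping idea behind HRPO can be illustrated with a minimal sketch. It assumes that "structurally similar" means "same number of search hops" and that each reward is normalized against the mean and standard deviation of its hop group, in the style of group-relative policy optimization. The function name, arguments, and normalization details are hypothetical and are not taken from the repository, whose implementation may differ.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def hop_grouped_advantages(hops, rewards, eps=1e-6):
    """Normalize each reward against questions with the same hop count.

    ASSUMPTION: grouping by hop count and mean/std normalization are
    illustrative choices, not the repository's confirmed method.
    """
    if len(hops) != len(rewards):
        raise ValueError("hops and rewards must have the same length")

    # Collect the sample indices that belong to each hop count.
    groups = defaultdict(list)
    for index, hop_count in enumerate(hops):
        groups[hop_count].append(index)

    advantages = [0.0] * len(rewards)
    for indices in groups.values():
        group_rewards = [rewards[i] for i in indices]
        mean = fmean(group_rewards)
        std = pstdev(group_rewards)  # population std; 0.0 for uniform groups
        for i in indices:
            # eps keeps the division finite when all rewards in a group match.
            advantages[i] = (rewards[i] - mean) / (std + eps)
    return advantages
```

Under these assumptions, one sampled answer per question is enough to obtain a baseline, because the baseline comes from other questions in the same hop group rather than from repeated samples of the same question. A call such as hop_grouped_advantages([1, 1, 2, 2], [1.0, 0.0, 1.0, 1.0]) gives a positive advantage to the first sample, a negative one to the second, and zero to the two-hop samples whose rewards are identical.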

Quick Start & Requirements

  • Installation: Requires a Python environment with PyTorch, transformers, faiss-gpu, and verl==0.5.0. Additional dependencies are listed within the repository.
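  • Search Engine Setup: A local search server with a retriever model is required. Setup involves downloading the English Wikipedia corpus and assembling a faiss index. In the commands below, $save_path refers to the directory passed to --save_path (./corpus in the first command).
    • Download Corpus: python scripts/download.py --save_path ./corpus
    • Assemble Index: cat $save_path/part_* > $save_path/e5_Flat.index (concatenates the part_* files into a single index file)
    • Decompress Corpus: gzip -d $save_path/wiki-18.jsonl.gz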
  • Initial Data Preparation:
    • python process_train.py --local_dir ./data
    • python process_test.py --local_dir ./data
  • Links: Code and scripts are provided within the repository; no separate documentation or demo links are explicitly mentioned.
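For environments where the shell steps are inconvenient, the index-assembly and decompression commands above can be approximated in Python. This is a sketch under assumptions: the file names (part_*, e5_Flat.index, wiki-18.jsonl.gz) come from the commands listed above, while the function names assemble_index and decompress_corpus are hypothetical helpers and are not part of the repository.

```python
import gzip
import shutil
from pathlib import Path


def assemble_index(save_path, out_name="e5_Flat.index"):
    """Concatenate part_* files in sorted order, like `cat part_* > out`."""
    save_dir = Path(save_path)
    parts = sorted(save_dir.glob("part_*"))
    if not parts:
        raise FileNotFoundError(f"no part_* files found in {save_dir}")
    out = save_dir / out_name
    with out.open("wb") as dst:
        for part in parts:
            with part.open("rb") as src:
                # Stream in chunks so large index parts are never fully in memory.
                shutil.copyfileobj(src, dst)
    return out


def decompress_corpus(gz_path):
    """Decompress a .gz file next to itself, like `gzip -d`."""
    gz_file = Path(gz_path)
    out = gz_file.with_suffix("")  # wiki-18.jsonl.gz -> wiki-18.jsonl
    with gzip.open(gz_file, "rb") as src, out.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    gz_file.unlink()  # gzip -d also removes the compressed file on success
    return out
```

Calling assemble_index("./corpus") followed by decompress_corpus("./corpus/wiki-18.jsonl.gz") would mirror the two shell commands. The sketch sorts part names as plain strings, which matches shell glob ordering for simple suffixes such as part_aa and part_ab but may differ under unusual locale settings.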

Highlighted Details

  • Achieves performance comparable to or exceeding fully supervised search agents, despite operating without any training data.
  • Employs HRPO to optimize training efficiency by clustering similar questions, significantly reducing compute needs.
  • Demonstrates that complex reasoning and search capabilities can emerge solely through self-evolutionary processes.

Maintenance & Community

No specific details regarding active contributors, community channels (e.g., Discord, Slack), sponsorships, or a public roadmap were found in the provided README.

Licensing & Compatibility

The code is released under a non-commercial license. This restriction means the project is not suitable for integration into commercial products or services.

Limitations & Caveats

The primary limitation is the non-commercial license, which restricts use to research and non-profit applications. Setting up the required local search engine, which includes downloading and indexing a large corpus, also takes significant initial effort.

Health Check

  • Last Commit: 5 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 209 stars in the last 30 days
