ZeroSearch by Alibaba-NLP

Research paper on incentivizing LLM search without real search engines

Created 4 months ago
1,132 stars

Top 34.0% on SourcePulse

Project Summary

ZeroSearch is a reinforcement learning framework that enhances the search capabilities of Large Language Models (LLMs) by simulating search interactions during training. It targets researchers and developers who want to improve LLM performance on information-retrieval tasks without incurring real search API costs. A fine-tuned simulation LLM generates both relevant and noisy documents that mimic real-world search results, and a curriculum rollout mechanism progressively raises the difficulty of those results to strengthen the policy model's reasoning.

How It Works

ZeroSearch employs a two-stage approach. First, supervised fine-tuning transforms an LLM into a retrieval module that generates simulated search results of controllable quality. Second, reinforcement learning (REINFORCE, GRPO, or PPO) incentivizes the policy LLM's search behavior against this simulated engine. Because the training loop never calls a real API, models can learn from a vast number of "searches" at no cost, and a curriculum rollout strategy gradually increases the difficulty of the retrieval scenarios to foster robust reasoning.
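
A minimal sketch of the curriculum rollout idea follows, assuming a simple linear noise schedule and a stubbed document generator; the actual schedule and the simulation LLM's interface in ZeroSearch may differ.

    import random

    def noise_probability(step: int, total_steps: int,
                          p_start: float = 0.0, p_end: float = 0.5) -> float:
        # Fraction of simulated search results that should be noisy at this step.
        # A linear ramp is assumed here purely for illustration.
        frac = min(step / max(total_steps, 1), 1.0)
        return p_start + frac * (p_end - p_start)

    def simulate_search(query: str, step: int, total_steps: int) -> str:
        # Stub standing in for the fine-tuned simulation LLM: early in training it
        # mostly returns useful documents, later an increasing share of noisy ones.
        mode = "noisy" if random.random() < noise_probability(step, total_steps) else "useful"
        return f"[{mode} document for query: {query!r}]"

    if __name__ == "__main__":
        for step in (0, 500, 1000):
            print(step, simulate_search("capital of Australia", step, total_steps=1000))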

Quick Start & Requirements

  • Installation: Requires conda for environment management; dependencies, including sglang, are installed via pip:
    # Create and activate the conda environment
    conda create -n zerosearch python=3.9
    conda activate zerosearch
    # Core dependencies: PyTorch built for CUDA 12.1 and vLLM for inference
    pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
    pip install vllm==0.6.3
    # Experiment tracking, plus the Google Search API client used by
    # configurations that query a real search engine
    pip install wandb
    pip install serpapi
    # Install ZeroSearch itself, then flash-attn and sglang
    pip install -e .
    pip3 install flash-attn --no-build-isolation
    pip install "sglang[all]"
    
  • Prerequisites: Python 3.9, PyTorch 2.4.0 with CUDA 12.1, vLLM 0.6.3, wandb, serpapi, flash-attn, and sglang. Requires a Google Search API key for certain configurations.
  • Data/Models: Download training datasets and simulation LLMs from Hugging Face (see the sketch after this list).
  • Resources: Training requires multiple GPUs (e.g., NUM_GPUS_PER_NODE=4).
  • Docs: https://github.com/Alibaba-NLP/ZeroSearch
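
The sketch referenced above shows one way to pull the training data and a simulation LLM from Hugging Face using huggingface_hub; the repo IDs are placeholders rather than the actual ZeroSearch repositories, so substitute the names listed in the project README.

    from huggingface_hub import snapshot_download

    # Placeholder repo IDs - replace with the dataset and simulation-LLM
    # repositories named in the ZeroSearch README.
    dataset_dir = snapshot_download(
        repo_id="<org>/ZeroSearch_dataset",
        repo_type="dataset",
    )
    sim_llm_dir = snapshot_download(repo_id="<org>/SearchSimulation_14B")

    print("dataset:", dataset_dir)
    print("simulation LLM:", sim_llm_dir)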

Highlighted Details

  • Achieves zero API cost for training search-enhanced LLMs.
  • Outperforms models using real search engines in experiments.
  • Generalizes across different LLM sizes and types (base and instruction-tuned).
  • Supports multiple RL algorithms (REINFORCE, GRPO, PPO) and simulation methods (prompt-based and fine-tuning-based); see the prompt-based sketch after this list.
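
As a rough illustration of the prompt-based simulation method, the sketch below builds an instruction that asks a generator LLM to act as a search engine and return either helpful or noisy documents. The template wording is an assumption for illustration, not ZeroSearch's actual prompt.

    SIMULATION_PROMPT = (
        "You are a search engine. Given the query below, return {k} short documents. "
        "The documents should be {quality} for answering the query.\n"
        "Query: {query}\nDocuments:"
    )

    def build_simulation_prompt(query: str, noisy: bool, k: int = 3) -> str:
        # Switch the requested document quality to control how hard the
        # retrieval scenario is for the policy model.
        quality = "misleading or irrelevant" if noisy else "relevant and helpful"
        return SIMULATION_PROMPT.format(k=k, quality=quality, query=query)

    if __name__ == "__main__":
        print(build_simulation_prompt("who proposed the transformer architecture", noisy=False))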

Maintenance & Community

The project was released in May 2025. Recent updates include new simulation LLMs, tuning datasets, and RL algorithm support. Contact: sunhao@stu.pku.edu.cn.

Licensing & Compatibility

The repository does not explicitly state a license in the README. This may pose compatibility issues for commercial or closed-source use.

Limitations & Caveats

The project is newly released (May 2025) and may be subject to rapid changes. The lack of a specified license requires clarification for any production use. The setup involves multiple complex dependencies and requires significant GPU resources for training.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 31 stars in the last 30 days

Starred by Jason Knight (Director of AI Compilers at NVIDIA; Cofounder of OctoML), Tim J. Baek (Founder of Open WebUI), and 6 more.

Explore Similar Projects

awesome-o1 by srush: Bibliography for OpenAI's o1 project. ~1k stars; created 11 months ago, updated 10 months ago.