DeepDive by THUDM

Advanced deep search agents for complex information retrieval

Created 10 months ago

332 stars

Top 82.2% on SourcePulse

1 Expert Loves This Project

Elvis Saravia

Founder of DAIR.AI

Project Summary

DeepDive addresses the challenge of training deep search agents capable of complex, multi-step information retrieval. Aimed at researchers and engineers, it automates data synthesis from knowledge graphs and trains web-browsing agents with multi-turn reinforcement learning to improve long-horizon reasoning.

How It Works

The project employs a two-stage approach. Stage 1 automates data synthesis: it performs random walks over knowledge graphs, uses LLMs to obfuscate entities into generalized "blurry entities," and keeps only the questions that even advanced models struggle with. Stage 2 trains agents end-to-end with multi-turn reinforcement learning using GRPO (Group Relative Policy Optimization) and strict binary rewards, which favors high-quality search trajectories and strengthens long-horizon reasoning and web browsing capabilities.
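The two stages can be illustrated with a minimal sketch. It is not the project's code: the toy graph `TOY_KG`, the helper names, and the exact-match reward are assumptions made for illustration, and the LLM-based entity obfuscation and difficulty filtering are omitted. It shows a random walk that yields a chain of triples (the raw material for a multi-hop question) and the group-relative advantage computation that GRPO applies to a batch of binary rewards.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy knowledge graph: entity -> list of (relation, neighbor).
TOY_KG = {
    "Ada Lovelace": [("collaborated_with", "Charles Babbage")],
    "Charles Babbage": [("designed", "Analytical Engine")],
    "Analytical Engine": [("described_by", "Ada Lovelace")],
}


def random_walk(kg, start, steps, rng):
    """Return a path of (head, relation, tail) triples from a random walk."""
    path, node = [], start
    for _ in range(steps):
        edges = kg.get(node, [])
        if not edges:  # dead end: stop the walk early
            break
        relation, neighbor = rng.choice(edges)
        path.append((node, relation, neighbor))
        node = neighbor
    return path


def binary_reward(predicted, gold):
    """Strict binary reward, simplified here to a normalized exact match."""
    return 1.0 if predicted.strip().lower() == gold.strip().lower() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Stage 1 (sketch): walk the graph to collect a multi-hop chain of facts.
path = random_walk(TOY_KG, "Ada Lovelace", steps=3, rng=random.Random(0))

# Stage 2 (sketch): score a group of sampled answers, then compute advantages.
sampled_answers = ["Analytical Engine", "Difference Engine", "analytical engine ", "unknown"]
rewards = [binary_reward(a, "Analytical Engine") for a in sampled_answers]
advantages = group_relative_advantages(rewards)
```

With binary rewards, a group in which every sample succeeds or every sample fails yields zero advantages and so no learning signal, which is one reason filtering for hard but solvable questions matters.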

Quick Start & Requirements

The project's models and core code are still under development, and no release date has been announced. The dataset is already available on Hugging Face. Training and inference are expected to require significant computational resources, likely including CUDA-capable GPUs.

Dataset: https://huggingface.co/datasets/zai-org/DeepDive
Paper: https://arxiv.org/pdf/2509.10446

Highlighted Details

DeepDive-32B achieves 14.8% on the BrowseComp benchmark, outperforming many open-source models and some proprietary ones.
Demonstrates significant test-time scaling: accuracy improves with increased tool calls and parallel sampling strategies.
Leverages automated data synthesis from knowledge graphs (KILT, AMiner) and semi-automated i.i.d. data generation for enhanced performance.
The RL training stage effectively utilizes long tool call horizons, outperforming SFT-only variants.
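One simple form of parallel sampling is to run the agent several times on the same question and aggregate the answers. The sketch below uses majority voting, which is an assumption chosen for illustration and is not confirmed as the strategy DeepDive uses; the function name and sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Pick the most frequent normalized answer among parallel samples.

    Empty answers are ignored; ties go to the answer seen first.
    """
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    if not normalized:
        return None
    return Counter(normalized).most_common(1)[0][0]


# Hypothetical answers from five independent agent runs on one question.
samples = ["Paris", "paris ", "Lyon", "Paris", ""]
final_answer = majority_vote(samples)
```

Other aggregation rules, such as weighting each sample by a confidence score, fit the same interface.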

Maintenance & Community

Recent updates include the release of the complete data construction pipeline, together with QA pairs and SFT trajectories. The project is developed by THUDM. The README provides no community channels (e.g., Discord, Slack) or roadmap links.

Licensing & Compatibility

The license type and any compatibility notes for commercial use or closed-source linking are not specified in the provided README.

Limitations & Caveats

The core models and code are still under development and not yet released. Although competitive, DeepDive still trails leading proprietary models on benchmarks such as BrowseComp. The absence of explicit licensing information is a potential adoption blocker.

Health Check

Last Commit

3 weeks ago

Responsiveness

Inactive

Star History

10 stars in the last 30 days