THUDM
Advanced deep search agents for complex information retrieval
DeepDive addresses the challenge of training deep search agents capable of complex, multi-step information retrieval. Aimed at researchers and engineers, it automates data synthesis from knowledge graphs and trains web browsing agents with multi-turn reinforcement learning to improve long-horizon reasoning.
How It Works
The project employs a two-stage approach. Stage 1 automates data synthesis: it performs random walks over a knowledge graph, uses LLMs to obfuscate entities into generalized "blurry entities," and keeps only the questions that even advanced models struggle with. Stage 2 trains agents end-to-end with multi-turn reinforcement learning (GRPO, Group Relative Policy Optimization) under strict binary rewards, which favors high-quality search trajectories and strengthens long-horizon reasoning and web browsing capabilities.
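The two stages can be illustrated with a minimal Python sketch. This is not DeepDive's code: the toy graph `TOY_KG`, the helper names, and the exact-match reward are hypothetical stand-ins chosen for illustration. The only parts taken from the description above are the random walk over a knowledge graph, the strict 0/1 reward, and GRPO's group-relative normalization of rewards.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy knowledge graph: entity -> list of (relation, neighbor).
TOY_KG = {
    "A": [("r1", "B"), ("r2", "C")],
    "B": [("r3", "C")],
    "C": [("r4", "A")],
}


def random_walk(kg, start, steps, rng):
    """Stage 1 (sketch): collect up to `steps` (head, relation, tail) hops."""
    path, node = [], start
    for _ in range(steps):
        edges = kg.get(node, [])
        if not edges:  # dead end: stop the walk early
            break
        relation, neighbor = rng.choice(edges)
        path.append((node, relation, neighbor))
        node = neighbor
    return path


def binary_reward(predicted, gold):
    """Strict 0/1 reward. Exact match is an assumed stand-in for the real check."""
    return 1.0 if predicted.strip().lower() == gold.strip().lower() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Stage 2 (sketch): GRPO-style advantage, (reward - group mean) / group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_walk(TOY_KG, "A", 3, rng))
    # Four rollouts for one question, scored against a placeholder gold answer.
    rewards = [binary_reward(p, "gold") for p in ["gold", "other", "other", "Gold "]]
    print(group_relative_advantages(rewards))
```

In this sketch, a multi-hop path from the walk would be turned into a question by an LLM that blurs the entities along it, a step omitted here. With binary rewards, a group of rollouts that all succeed or all fail yields zero advantage for every rollout, which is one reason filtering for hard but solvable questions matters.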
Quick Start & Requirements
The project's models and core code are still under development, and no release date has been announced. Some components, such as the dataset, are already available on Hugging Face. Training and inference are expected to require significant computational resources, likely CUDA-capable GPUs.
Highlighted Details
Maintenance & Community
The project has seen recent updates with the release of its complete data construction pipeline and QA pairs/SFT trajectories. It is developed by THUDM. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The license type and any compatibility notes for commercial use or closed-source linking are not specified in the provided README.
Limitations & Caveats
The core models and code are still under development and not yet released. Although competitive, DeepDive still trails leading proprietary models on benchmarks such as BrowseComp. The lack of explicit licensing information is a potential adoption blocker.