WebThinker by RUC-NLPIR

Research framework for autonomous web search and report drafting

created 4 months ago
1,194 stars

Top 33.5% on sourcepulse

Project Summary

WebThinker is a framework that lets Large Reasoning Models (LRMs) conduct deep web research autonomously: they search, explore web pages, and draft research reports as part of their thinking process. It targets researchers and other users who need in-depth, automated information gathering and report generation, and it offers an end-to-end solution that integrates knowledge acquisition directly into the LRM's reasoning.

How It Works

WebThinker uses a "Think-Search-and-Draft" strategy that lets LRMs interact with the web while they reason. A Deep Web Explorer lets the model run searches, navigate pages by clicking elements, and extract information, and it supports autonomous follow-up searches and deeper link traversal. For report generation, the LRM is given tools for drafting, checking, and editing report sections, so it can keep the report coherent and revise it as new information arrives. RL-based training strategies that use preference pairs from complex tasks are being developed to optimize performance.
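The control flow described above can be sketched as a small loop. This is only an illustrative sketch, not WebThinker's actual code: the names `Action`, `ResearchState`, `run_research`, `toy_think`, and `toy_search` are hypothetical, the "model" is a hard-coded policy, and the "web" is an in-memory dictionary standing in for a real search backend.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One decision made during reasoning (hypothetical structure)."""
    kind: str          # "search", "draft", or "finish"
    payload: str = ""  # query text for "search", section text for "draft"


@dataclass
class ResearchState:
    """Everything the reasoning step can see (hypothetical structure)."""
    question: str
    notes: List[str] = field(default_factory=list)   # extracted information
    report: List[str] = field(default_factory=list)  # drafted sections


def run_research(
    question: str,
    think: Callable[[ResearchState], Action],
    search: Callable[[str], List[str]],
    max_steps: int = 10,
) -> ResearchState:
    """Alternate between thinking, searching, and drafting until finished."""
    state = ResearchState(question)
    for _ in range(max_steps):
        action = think(state)
        if action.kind == "search":
            # Gather new information and feed it back into the state.
            state.notes.extend(search(action.payload))
        elif action.kind == "draft":
            # Append a report section written from the notes so far.
            state.report.append(action.payload)
        elif action.kind == "finish":
            break
        else:
            raise ValueError(f"unknown action kind: {action.kind!r}")
    return state


def toy_search(query: str) -> List[str]:
    """Stand-in for a web search: looks the query up in a tiny fixed corpus."""
    corpus = {
        "webthinker license": ["WebThinker is released under the MIT License."],
    }
    return corpus.get(query.lower(), [])


def toy_think(state: ResearchState) -> Action:
    """Stand-in for the LRM: search first, then draft, then stop."""
    if not state.notes:
        return Action("search", "WebThinker license")
    if not state.report:
        return Action("draft", "Findings: " + state.notes[0])
    return Action("finish")


if __name__ == "__main__":
    final = run_research("What license does WebThinker use?", toy_think, toy_search)
    print("\n".join(final.report))
```

In a real system the `think` callable would be the reasoning model and `search` would wrap a web search and page-extraction step; keeping both as injected callables makes the loop easy to test without network access.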

Quick Start & Requirements

Highlighted Details

  • Outperforms RAG-based agents on knowledge-intensive benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and open-ended report generation.
  • Enables LRMs to autonomously search, navigate, and extract information from web pages.
  • Integrates real-time knowledge seeking with report creation using specialized drafting, checking, and editing tools.
  • Supports multiple model sizes and types for reasoning and auxiliary tasks.

Maintenance & Community

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License permits commercial use and integration with closed-source systems.

Limitations & Caveats

RL-based training strategies are still under active development, so the project may change as that research progresses. Running the framework requires a model serving setup (vLLM) and a search API key (Bing).

Health Check

  • Last commit: 4 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 2

Star History

795 stars in the last 90 days

Explore Similar Projects

Starred by Jason Liu (Author of Instructor) and Ross Taylor (Cofounder of General Reasoning; Creator of Papers with Code).

Search-R1 by PeterGriffinJin

1.1%
3k
RL framework for training LLMs to use search engines
created 5 months ago
updated 3 weeks ago