WebThinker by RUC-NLPIR

Research framework for autonomous web search and report drafting

Created 11 months ago

1,404 stars

Top 28.5% on SourcePulse

View on GitHub

1 Expert Loves This Project

Casper Hansen

Author of AutoAWQ

Project Summary

WebThinker is a framework that empowers Large Reasoning Models (LRMs) to conduct deep web research autonomously, enabling them to search, explore web pages, and draft research reports within their thinking process. It targets researchers and users needing in-depth, automated information gathering and report generation, offering an end-to-end solution that integrates knowledge acquisition directly into the LRM's reasoning.

How It Works

WebThinker utilizes a "Think-Search-and-Draft" strategy, allowing LRMs to interact with the web. A Deep Web Explorer enables models to perform searches, navigate pages by clicking elements, and extract information. The framework supports autonomous follow-up searches and deeper link traversal. For report generation, LRMs are equipped with tools for drafting, checking, and editing report sections, ensuring coherence and adaptability. RL-based training strategies are being developed to optimize performance using preference pairs from complex tasks.

Quick Start & Requirements

Installation: Create a conda environment (conda create -n webthinker python=3.9), activate it (conda activate webthinker), and install requirements (pip install -r requirements.txt).
Prerequisites: Requires models to be served via vLLM (e.g., QwQ-32B as reasoning model, Qwen-72B-Instruct as auxiliary). A Bing Search API subscription key is necessary. Recommended: Crawl4AI for web parsing.
Demo: A Streamlit demo is available (cd demo; streamlit run_demo.py).
Documentation: Notion page: https://foremost-beechnut-8ed.notion.site/WebThinker-Empowering-Large-Reasoning-Models-with-Deep-Research-Capability-d13158a27d924a4b9df7f9ab94066b64
Paper: https://arxiv.org/abs/2504.21776

Highlighted Details

Outperforms RAG-based agents on knowledge-intensive benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and open-ended report generation.
Enables LRMs to autonomously search, navigate, and extract information from web pages.
Integrates real-time knowledge seeking with report creation using specialized drafting, checking, and editing tools.
Supports multiple model sizes and types for reasoning and auxiliary tasks.

Maintenance & Community

Models are available on Hugging Face.
Paper published on arXiv.
Contact: xiaoxi_li@ruc.edu.cn

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive for commercial use and integration with closed-source systems.

Limitations & Caveats

The project is actively developing RL-based training strategies, suggesting ongoing research and potential for future improvements or changes. Specific model serving configurations (vLLM) and API keys (Bing) are required for operation.

Health Check

Last Commit

2 months ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

12 stars in the last 30 days