OpenDeepResearcher by mshumer

AI researcher notebook for iterative information gathering

Created 11 months ago

2,748 stars

Top 17.2% on SourcePulse

1 Expert Loves This Project

Elie Bursztein

Cybersecurity Lead at Google DeepMind

Project Summary

OpenDeepResearcher is an AI-powered research assistant that iteratively searches the web to gather comprehensive information on a user-defined topic. It's designed for researchers, students, or anyone needing to synthesize information from multiple online sources, automating the process of query generation, information retrieval, and context extraction.

How It Works

The system runs an iterative research loop driven by an LLM. It begins by generating initial search queries, then executes them concurrently via SERPAPI. Retrieved links are deduplicated; each page's content is fetched with Jina and evaluated for relevance by an LLM, which also extracts the relevant context. The aggregated context is then fed back to the LLM to decide whether further searches are needed, and the loop repeats until sufficient information is gathered or the iteration limit is reached. The final report is synthesized from all extracted context.
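The loop described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `research_loop` and the four callables (`generate_queries`, `search`, `extract_context`, `next_queries`) are hypothetical stand-ins for the LLM, SERPAPI, and Jina calls, and the sketch runs sequentially for readability where the project runs searches and fetches concurrently.

```python
from typing import Callable, List, Optional


def research_loop(
    topic: str,
    generate_queries: Callable[[str], List[str]],         # hypothetical: LLM proposes initial queries
    search: Callable[[str], List[str]],                   # hypothetical: search API returns links
    extract_context: Callable[[str, str], Optional[str]], # hypothetical: fetch page, judge relevance
    next_queries: Callable[[str, List[str]], List[str]],  # hypothetical: LLM decides on follow-ups
    max_iterations: int = 10,
) -> List[str]:
    """Gather context until no follow-up queries remain or the limit is reached."""
    seen_links = set()
    contexts: List[str] = []
    queries = generate_queries(topic)

    for _ in range(max_iterations):
        if not queries:
            break  # the LLM judged the gathered context sufficient

        # Collect links for this round, skipping any seen in earlier rounds.
        new_links: List[str] = []
        for query in queries:
            for link in search(query):
                if link not in seen_links:
                    seen_links.add(link)
                    new_links.append(link)

        # Keep only pages judged relevant (extract_context returns None otherwise).
        for link in new_links:
            context = extract_context(topic, link)
            if context:
                contexts.append(context)

        # Feed everything gathered so far back to decide what to search next.
        queries = next_queries(topic, contexts)

    return contexts
```

Passing the external services in as callables keeps the control flow testable without any API keys; the returned list is what a final report step would be synthesized from.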

Quick Start & Requirements

Install: Clone the repository or open the provided Google Colab notebook.
Prerequisites: API access and keys for OpenRouter, SERPAPI, and Jina.
Setup: Install nest_asyncio and configure API keys within the notebook.
Usage: Run notebook cells sequentially, providing a research query and optional iteration limit.
Links: Google Colab Notebook
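Before running the cells, it can help to confirm that all three keys are present. The sketch below is one possible check, assuming the keys are supplied as environment variables; the variable names and the `missing_keys` helper are illustrative assumptions, and the notebook itself may expect the keys to be set directly in a cell instead.

```python
import os
from typing import List, Mapping

# Assumed names for illustration only; match them to whatever the notebook uses.
REQUIRED_KEYS = ("OPENROUTER_API_KEY", "SERPAPI_API_KEY", "JINA_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or empty in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```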

Highlighted Details

Iterative, LLM-driven query refinement.
Asynchronous processing for concurrent searches and data fetching.
LLM-powered relevance evaluation and context extraction.
Gradio interface available for a functional UI.
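The concurrent search and deduplication mentioned above can be illustrated with `asyncio.gather`. This is a sketch under stated assumptions rather than the project's code: `search_all` and `fake_search` are hypothetical names, and `fake_search` simulates a search call with a short sleep instead of contacting a real API.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def search_all(
    queries: Iterable[str],
    search: Callable[[str], Awaitable[List[str]]],
) -> List[str]:
    """Run every search concurrently and return unique links in first-seen order."""
    # gather() schedules all coroutines at once and returns results in input order.
    results = await asyncio.gather(*(search(query) for query in queries))

    seen = set()
    unique: List[str] = []
    for links in results:
        for link in links:
            if link not in seen:
                seen.add(link)
                unique.append(link)
    return unique


async def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a real search request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"https://example.com/{query}", "https://example.com/shared"]


if __name__ == "__main__":
    print(asyncio.run(search_all(["a", "b"], fake_search)))
```

In a notebook, where an event loop is already running, `await search_all(...)` would be used in place of `asyncio.run(...)`; allowing nested loops in that setting is what `nest_asyncio` is for.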

Maintenance & Community

The project is maintained by mshumer, who can be followed on X for updates.

Licensing & Compatibility

Released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Requires multiple third-party API keys, which may incur costs and be subject to rate limits. Effectiveness depends on the quality of the LLM and the accuracy of SERPAPI results.

Health Check

Last Commit: 8 months ago
Responsiveness: Inactive

Star History

12 stars in the last 30 days