open_deep_research by langchain-ai

Open-source research assistant for automated deep research, generating comprehensive reports

Created 1 year ago

10,150 stars

Top 5.0% on SourcePulse

4 Experts Love This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Elvis Saravia

Founder of DAIR.AI

Philipp Schmid

DevRel at Google DeepMind

Rotem Weiss

Cofounder of Tavily

Project Summary

This project provides an experimental, open-source research assistant designed to automate deep research and generate comprehensive reports on any topic. It offers two distinct implementations—a structured workflow and a parallel multi-agent architecture—allowing users to customize models, prompts, report structure, and search tools for tailored research outcomes.

How It Works

The project offers two primary architectures: a graph-based workflow and a multi-agent system. The workflow implementation follows a plan-and-execute model, with a distinct planning phase, human-in-the-loop review for the report plan, and sequential section generation with reflection. This approach emphasizes user control and report accuracy. The multi-agent implementation uses a supervisor-researcher model where multiple agents work in parallel to research and write sections simultaneously, prioritizing speed and efficiency.
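The difference between the two control flows can be sketched in plain Python. This is a minimal illustration, not the project's code: every name here (Section, plan_report, write_section, reflect, run_workflow, run_multi_agent) is hypothetical, and the planner, search, and writer steps are replaced by stubs. The real workflow also feeds reviewer feedback back into planning, whereas this sketch simply raises on rejection.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    content: str = ""


def plan_report(topic):
    # Stand-in for the planner model: returns a fixed two-section outline.
    return [Section(f"{topic}: background"), Section(f"{topic}: open problems")]


def write_section(section):
    # Stand-in for the search-and-write step.
    return Section(section.title, f"Draft notes on {section.title}.")


def reflect(section):
    # Stand-in for the reflection step: accept any non-empty draft.
    return bool(section.content)


def run_workflow(topic, approve_plan, max_passes=2):
    """Plan, pause for human approval, then write sections one at a time."""
    plan = plan_report(topic)
    if not approve_plan(plan):
        raise ValueError("report plan rejected by reviewer")
    report = []
    for section in plan:
        draft = section
        for _ in range(max_passes):
            draft = write_section(draft)
            if reflect(draft):
                break
        report.append(draft)
    return report


async def research_agent(section):
    # One researcher handles one section; sleep(0) yields control the way
    # a real network call would.
    await asyncio.sleep(0)
    return write_section(section)


async def run_multi_agent(topic):
    """Supervisor hands each section to a researcher; all run concurrently."""
    plan = plan_report(topic)
    return list(await asyncio.gather(*(research_agent(s) for s in plan)))
```

Calling run_workflow("RAG", approve_plan=lambda plan: True) writes the sections in order after the approval callback returns, while asyncio.run(run_multi_agent("RAG")) starts all researchers at once and collects their results in plan order.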

Quick Start & Requirements

Install dependencies and launch the LangGraph server locally:
  Mac: uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking
  Windows/Linux: pip install -e . then langgraph dev
Requires Python 3.11.
Supports various LLMs via init_chat_model() and multiple search tools (Tavily, Perplexity, Exa, ArXiv, PubMed, Linkup, DuckDuckGo, Google Search).
Setup involves cloning the repo, copying .env.example to .env, and configuring API keys and model choices.
Links: LangGraph Studio UI, API Docs.
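Before launching the server it can help to confirm that the .env file actually contains the keys your chosen models and search tools need. The following is a small stdlib-only sketch of such a check. The helper names (parse_env_text, missing_keys) are hypothetical, and the key names used in the usage note are assumptions about a typical setup, not a list taken from the project's .env.example.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]
```

For example, missing_keys(parse_env_text(open(".env").read()), ["TAVILY_API_KEY", "OPENAI_API_KEY"]) returns the subset of those (assumed) keys that still need a value.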

Highlighted Details

Two distinct implementations: Graph-based Workflow and Multi-Agent.
Supports a wide array of LLMs and search tools, with specific configurations for Exa, ArXiv, and PubMed.
Allows for detailed customization of report structure, query generation, and model choices for planning and writing.
Includes testing scripts to compare report quality across different model configurations and implementations.

Maintenance & Community

Developed by langchain-ai.
Further community and roadmap information can be found via the GitHub repository.

Licensing & Compatibility

The project is open-source, but the README does not state a specific license. Verify the license applied to the repository before relying on it for commercial use.

Limitations & Caveats

The multi-agent implementation currently supports only Tavily Search. Model selection is critical: planner and writer models must support structured outputs, and agent models need reliable tool calling. Models such as DeepSeek-R1 are noted as weak at function calling. Some LLM providers impose token-per-minute limits (e.g., Groq's on-demand tier).
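To see why structured-output support matters, consider what happens downstream of the planner: its reply has to be turned into typed sections before any writing can start. The sketch below assumes a hypothetical JSON shape (a "sections" list whose items have name, description, and research fields) and a hypothetical parse_plan helper; the project's actual schema may differ. A model that replies in free-form prose fails at the first step.

```python
import json
from dataclasses import dataclass


@dataclass
class PlannedSection:
    name: str
    description: str
    research: bool


def parse_plan(raw):
    """Turn a planner reply into typed sections, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"planner reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("sections"), list):
        raise ValueError("planner reply must be an object with a 'sections' list")
    sections = []
    for item in data["sections"]:
        if not isinstance(item, dict):
            raise ValueError("each section must be a JSON object")
        try:
            sections.append(
                PlannedSection(
                    name=str(item["name"]),
                    description=str(item["description"]),
                    research=bool(item["research"]),
                )
            )
        except KeyError as exc:
            raise ValueError(f"section is missing field {exc}") from exc
    return sections
```

A well-formed reply parses into PlannedSection objects, while a reply such as "Sure, here is the plan" raises ValueError, which is the kind of failure a model without dependable structured output would trigger.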

Health Check

Last Commit

4 months ago

Responsiveness

1 day

Star History

289 stars in the last 30 days