deep_research_from_scratch by langchain-ai

Deep research agent framework

Created 2 months ago
451 stars

Top 66.8% on SourcePulse

Project Summary

This repository provides a framework for building a deep research agent from scratch, designed to automate the process of generating comprehensive reports on complex topics. It targets developers and researchers looking to implement or customize advanced AI-powered research systems, offering a modular approach that allows for the integration of custom models and tools. The primary benefit is a configurable and extensible system for automating open-ended research tasks.

How It Works

The system employs a multi-agent, three-phase architecture: Scoping, Research, and Writing. It leverages LangGraph for agent orchestration, enabling flexible workflows and state management. Key components include user clarification for defining research scope, iterative research agents that utilize tools like Tavily search and Model Context Protocol (MCP) servers, and a supervisor agent for coordinating parallel research tasks. This modular design allows for the integration of various LLMs and tools, promoting adaptability and customization.
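The three phases can be pictured as functions passing a shared state along, with the supervisor fanning out research tasks in parallel. The sketch below is plain Python with asyncio, not LangGraph and not the repository's code. Every name in it (ResearchState, scope, research_one, supervise, write, run_pipeline) is hypothetical, and the research step is a stub where a real agent would call an LLM and search tools.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ResearchState:
    """Shared state handed from phase to phase (hypothetical shape)."""
    request: str
    subtopics: list[str] = field(default_factory=list)
    findings: dict[str, str] = field(default_factory=dict)
    report: str = ""


def scope(state: ResearchState) -> ResearchState:
    # Phase 1 (Scoping): derive subtopics from the request.
    # Assumption: a real system would clarify scope with the user via an LLM;
    # this stub simply splits the request on " and ".
    state.subtopics = [p.strip() for p in state.request.split(" and ") if p.strip()]
    return state


async def research_one(topic: str) -> str:
    # Stub research agent; a real one would loop over search tool calls.
    await asyncio.sleep(0)
    return f"notes on {topic}"


async def supervise(state: ResearchState) -> ResearchState:
    # Phase 2 (Research): the supervisor runs one task per subtopic concurrently.
    results = await asyncio.gather(*(research_one(t) for t in state.subtopics))
    state.findings = dict(zip(state.subtopics, results))
    return state


def write(state: ResearchState) -> ResearchState:
    # Phase 3 (Writing): assemble the findings into a single report string.
    lines = [f"Report: {state.request}"]
    lines += [f"- {topic}: {notes}" for topic, notes in state.findings.items()]
    state.report = "\n".join(lines)
    return state


async def run_pipeline(request: str) -> ResearchState:
    state = scope(ResearchState(request=request))
    state = await supervise(state)
    return write(state)


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("solar power and wind power")).report)
```

Keeping all intermediate results in one state object mirrors the state-management idea described above: each phase reads what earlier phases produced and adds its own output.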

Quick Start & Requirements

  • Installation: Clone the repository, then use uv sync to install dependencies and create a virtual environment.
  • Prerequisites: Node.js and npx (for MCP server), Python 3.11 or later, uv package manager.
  • Configuration: Create a .env file with API keys for services like Tavily and OpenAI/Anthropic.
  • Running: Use uv run jupyter notebook to run the provided tutorial notebooks.
  • Documentation: Tutorial notebooks are available in the notebooks/ directory.
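
Before launching the notebooks, it can help to confirm that the .env file actually contains the expected keys. The following is a minimal standard-library sketch. The key names TAVILY_API_KEY, OPENAI_API_KEY, and ANTHROPIC_API_KEY are assumptions based on common naming conventions, not names confirmed by the README, and the parser handles only simple KEY=VALUE lines.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: list[str], any_of: list[str]) -> list[str]:
    """List each unset required key, plus one entry if no key in any_of is set."""
    problems = [key for key in required if not env.get(key)]
    if not any(env.get(key) for key in any_of):
        problems.append(" or ".join(any_of))
    return problems


if __name__ == "__main__":
    # Assumed key names; adjust to whatever the repository's README specifies.
    sample = '# keys\nTAVILY_API_KEY="tvly-x"\nOPENAI_API_KEY=\n'
    env = parse_env(sample)
    print(missing_keys(env, ["TAVILY_API_KEY"], ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]))
```

The any_of argument reflects the "OpenAI/Anthropic" wording above: one model provider key is treated as sufficient, while the search key is treated as required.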

Highlighted Details

  • Modular design with 5 tutorial notebooks, each building upon the last to construct a full research system.
  • Supports integration of external search tools (e.g., Tavily) and Model Context Protocol (MCP) servers.
  • Implements advanced agent patterns including ReAct loops, supervisor coordination for parallel research, and structured output for reliable decision-making.
  • Focuses on key learning outcomes such as state management, async orchestration, and end-to-end workflow design.
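
To make the ReAct and structured-output patterns concrete, here is a minimal sketch of a loop in which a model returns a typed decision at each step: either call a tool or finish. It is an illustration under stated assumptions, not the notebooks' implementation. Decision, react_loop, stub_decide, and stub_search are hypothetical names, and the model is replaced by a deterministic stub so the example runs without API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    """Structured output expected from the model at every step."""
    action: str    # name of a tool to call, or "finish"
    argument: str  # tool input, or the final answer when action == "finish"


def react_loop(
    decide: Callable[[str, list[str]], Decision],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate between deciding (reason) and tool calls (act) until finished."""
    observations: list[str] = []
    for _ in range(max_steps):
        decision = decide(question, observations)
        if decision.action == "finish":
            return decision.argument
        tool = tools.get(decision.action)
        if tool is None:
            # Feed the error back so the next decision can correct course.
            observations.append(f"unknown tool: {decision.action}")
            continue
        observations.append(tool(decision.argument))
    return "step limit reached: " + "; ".join(observations)


def stub_decide(question: str, observations: list[str]) -> Decision:
    # Hypothetical stand-in for an LLM: search once, then finish with the result.
    if not observations:
        return Decision("search", question)
    return Decision("finish", observations[-1])


def stub_search(query: str) -> str:
    # Hypothetical stand-in for a search tool such as Tavily.
    return f"result for {query!r}"


if __name__ == "__main__":
    print(react_loop(stub_decide, {"search": stub_search}, "what is MCP"))
```

Because the decision is a typed object and not free text, the loop can branch on decision.action directly, which is the reliability benefit the structured-output bullet refers to. The max_steps cap keeps a misbehaving model from looping indefinitely.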

Maintenance & Community

The repository is maintained by langchain-ai. Specific community links (Discord/Slack) or detailed roadmap information are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not specified in the provided README. Compatibility for commercial use or closed-source linking would depend on the unstated license terms.

Limitations & Caveats

The README does not list limitations, known bugs, or an alpha status. The system requires API keys for external services such as Tavily and OpenAI/Anthropic, which may incur costs. Setup spans several toolchains: Python 3.11 or later with uv, Node.js with npx for the MCP server, and a .env file for the keys.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 3

Star History

361 stars in the last 30 days
