open_deep_research by togethercomputer

Agentic LLM workflow for in-depth research on complex topics

Created 9 months ago

355 stars

Top 78.8% on SourcePulse

4 Experts Love This Project

Jeffrey Wang

Cofounder of Exa

Vincent Weisser

Cofounder of Prime Intellect

Jeff Hammerbacher

Cofounder of Cloudera

Dan Guido

Cofounder of Trail of Bits

Project Summary

This project provides an agentic LLM workflow for multi-hop research on complex topics, modeled on how a human researcher works. It is aimed at researchers, students, and anyone who needs in-depth, well-cited content, and it extends traditional web search with structured information gathering and source verification.

How It Works

The system runs a multi-stage, agentic LLM process that plans, searches, evaluates, and iterates to produce a detailed research report. Several self-reflection stages check the quality of the gathered information, and sources are verified and cited for all information in the report. The architecture is designed to be extensible so that the community can contribute.
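The plan, search, evaluate, iterate cycle described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `Finding`, `ResearchState`, `run_research`, and `write_report` are hypothetical, and the `plan`, `search`, and `evaluate` callables stand in for the LLM and web-search calls that the real system would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    """One piece of gathered information (hypothetical structure)."""
    query: str
    text: str
    source: str  # URL or identifier, kept so the report can cite it


@dataclass
class ResearchState:
    topic: str
    findings: List[Finding] = field(default_factory=list)
    rounds: int = 0


def run_research(
    topic: str,
    plan: Callable[[str], List[str]],
    search: Callable[[str], List[Finding]],
    evaluate: Callable[[str, List[Finding]], List[str]],
    max_rounds: int = 3,
) -> ResearchState:
    """Plan queries, search, then let `evaluate` request follow-up queries.

    `evaluate` plays the role of a self-reflection stage: it returns the
    follow-up queries still needed, or an empty list when coverage is enough.
    """
    state = ResearchState(topic=topic)
    queries = plan(topic)
    while queries and state.rounds < max_rounds:
        state.rounds += 1
        for query in queries:
            state.findings.extend(search(query))
        queries = evaluate(topic, state.findings)
    return state


def write_report(state: ResearchState) -> str:
    """Render findings with numbered citations and a source list."""
    sources: List[str] = []
    lines = [f"Report: {state.topic}"]
    for finding in state.findings:
        if finding.source not in sources:
            sources.append(finding.source)
        lines.append(f"{finding.text} [{sources.index(finding.source) + 1}]")
    lines.append("Sources:")
    lines.extend(f"[{i}] {src}" for i, src in enumerate(sources, start=1))
    return "\n".join(lines)
```

Passing the three stages in as callables keeps the loop testable with stubs. The `max_rounds` cap is an assumption added here so that a reflection stage that keeps asking for more cannot loop forever.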

Quick Start & Requirements

Install: Use uv, a faster alternative to pip.

# Create and activate virtual environment
uv venv --python=3.12
source .venv/bin/activate
# Install project dependencies
uv pip install -r pyproject.toml

Prerequisites: Python 3.12+, Pandoc, pdfLaTeX (via BasicTeX on macOS or texlive-xetex on Ubuntu).
API Keys: Requires TOGETHER_API_KEY, TAVILY_API_KEY, and HUGGINGFACE_TOKEN.
Usage: Run via CLI (python src/together_open_deep_research.py --config configs/open_deep_researcher_config.yaml) or Gradio webapp (python src/webapp.py).
Docs: The README has Overview, Features, Installation, and Usage sections.
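Before a first run, it can help to confirm that the prerequisites and API keys listed above are in place. The following is a small preflight sketch, not part of the project: the function name `missing_requirements` is hypothetical, and it assumes that Pandoc and pdfLaTeX are available on PATH under their usual executable names, `pandoc` and `pdflatex`.

```python
import os
import shutil

# Environment variables named in the requirements above.
REQUIRED_ENV_VARS = ("TOGETHER_API_KEY", "TAVILY_API_KEY", "HUGGINGFACE_TOKEN")
# Assumed executable names for Pandoc and pdfLaTeX.
REQUIRED_TOOLS = ("pandoc", "pdflatex")


def missing_requirements(env=None, which=shutil.which):
    """Return a list of human-readable descriptions of what is missing.

    `env` and `which` are injectable so the check can be tested without
    touching the real environment.
    """
    env = os.environ if env is None else env
    missing = [
        f"environment variable {name}"
        for name in REQUIRED_ENV_VARS
        if not env.get(name)  # unset or empty both count as missing
    ]
    missing += [
        f"executable {tool}"
        for tool in REQUIRED_TOOLS
        if which(tool) is None
    ]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing requirements:")
        for item in problems:
            print(f"  - {item}")
    else:
        print("All listed requirements found.")
```

Running the script inside the activated virtual environment prints whatever is still missing. It does not check the Python version or validate that the keys are accepted by the respective services.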

Highlighted Details

Generates long-form, well-cited research reports.
Supports output formats including PDF, HTML, and Podcast.
Includes source verification and citations for all information.
Extensible architecture for community contributions.

Maintenance & Community

The project is from Together Computer. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The README does not specify a license. Users should verify licensing terms before use, especially for commercial applications.

Limitations & Caveats

As an LLM-based system, it may generate hallucinations, exhibit biases from training data, misinterpret queries, or present outdated information. Users are advised to always verify critical information with primary sources.

Health Check

Last Commit: 9 months ago

Responsiveness: Inactive

Pull Requests (30d)

Issues (30d)

Star History

2 stars in the last 30 days