open-deep-research by nickscamara

Open-source AI agent for web research

Created 11 months ago

6,140 stars

Top 8.3% on SourcePulse


5 Experts Love This Project

Didier Lopes

Founder of OpenBB

Elie Bursztein

Cybersecurity Lead at Google DeepMind

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Eric Ciarla

Cofounder of Firecrawl

and 1 more!

Project Summary

This project provides an open-source implementation of an AI agent for deep web research, mimicking OpenAI's Deep Research experiment. It targets developers and researchers seeking to leverage AI for comprehensive web data analysis, offering a real-time, search-and-extract approach powered by Firecrawl and a reasoning model.

How It Works

The system utilizes Firecrawl for web scraping and data extraction, feeding real-time data into an AI model for analysis. It employs a Next.js App Router architecture with React Server Components for efficient rendering and server-side logic. The AI SDK supports multiple LLM providers (OpenAI, Anthropic, Cohere) and enables dynamic chat interfaces, with OpenAI's GPT-4o as the default. A separate reasoning model, configurable via environment variables, handles complex analysis and structured data output.
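
The split between a default chat model and a separately configured reasoning model can be sketched as follows. This is a minimal illustration, not the project's actual code: the variable name `REASONING_MODEL`, the `resolveModels` helper, and the fallback to the chat model are all assumptions made for the example.

```typescript
// Hypothetical sketch: the names and the fallback behaviour are assumptions,
// not taken from the open-deep-research source.
type Env = Record<string, string | undefined>;

interface ModelConfig {
  chatModel: string;      // model behind the chat interface
  reasoningModel: string; // model used for analysis and structured output
}

// The summary names GPT-4o as the default chat model.
const DEFAULT_CHAT_MODEL = "gpt-4o";

function resolveModels(env: Env): ModelConfig {
  // Treat an unset or blank variable as "not configured".
  const configured = env["REASONING_MODEL"]?.trim();
  return {
    chatModel: DEFAULT_CHAT_MODEL,
    // Assumed fallback: reuse the chat model when nothing is configured.
    reasoningModel: configured ? configured : DEFAULT_CHAT_MODEL,
  };
}

// Usage: pass process.env in a Node.js runtime, or a plain object in tests.
const models = resolveModels({ REASONING_MODEL: "example-reasoning-model" });
console.log(models.reasoningModel);
```

Taking the environment as a parameter rather than reading it globally keeps the helper easy to test and makes the configuration source explicit.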

Quick Start & Requirements

Install dependencies: pnpm install
Run migrations: pnpm db:migrate
Start the app: pnpm dev
Requires Node.js and pnpm.
Vercel CLI recommended for deployment and environment variable management.
Official demo available: https://deep-research.vercel.app/

Highlighted Details

Leverages Firecrawl for robust web data extraction and search.
Supports multiple LLM providers via the AI SDK, including OpenAI, Anthropic, and Cohere.
Built with Next.js App Router, React Server Components, and shadcn/ui for a modern stack.
Integrates Vercel Postgres (Neon) for data persistence and Vercel Blob for file storage.
Offers configurable reasoning models for advanced analysis, with options for JSON schema validation.

Maintenance & Community

The project is maintained by nickscamara. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The project's licensing is not specified in the README. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The function execution timeout is set to 300 seconds by default; Vercel Hobby tier users must reduce it to 60 seconds. Using a non-OpenAI model for reasoning may require disabling JSON schema validation, which can make the structure of responses less predictable.
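
Both caveats can be made concrete with a small sketch. It is illustrative only: the tier names, the `maxDurationSeconds` and `parseReasoningOutput` helpers, and the `bypassValidation` flag are hypothetical, and the required-key check is a simple stand-in for real JSON schema validation.

```typescript
// Hypothetical sketch; helper names, tier names, and the flag are assumptions.
type Tier = "hobby" | "default";

// 300 s is the default per the caveat above; the Hobby tier needs 60 s.
function maxDurationSeconds(tier: Tier): number {
  return tier === "hobby" ? 60 : 300;
}

type Parsed =
  | { kind: "structured"; data: Record<string, unknown> }
  | { kind: "raw"; text: string };

function parseReasoningOutput(
  raw: string,
  requiredKeys: string[],
  bypassValidation: boolean,
): Parsed {
  // With validation disabled, pass the model text through untouched.
  if (bypassValidation) return { kind: "raw", text: raw };

  const data: unknown = JSON.parse(raw); // throws SyntaxError on invalid JSON
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  // Stand-in for schema validation: every required key must be present.
  for (const key of requiredKeys) {
    if (!(key in record)) throw new Error(`missing key: ${key}`);
  }
  return { kind: "structured", data: record };
}

console.log(maxDurationSeconds("hobby"));
```

With `bypassValidation` set, callers receive unparsed text and have to handle any structure themselves, which is the trade-off the caveat describes.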

Health Check

Last Commit

8 months ago

Responsiveness

1 day

Star History

31 stars in the last 30 days