open-deep-research  by nickscamara

Open-source AI agent for web research

created 6 months ago
5,913 stars

Top 8.9% on sourcepulse

Project Summary

This project provides an open-source implementation of an AI agent for deep web research, mimicking OpenAI's Deep Research experiment. It targets developers and researchers seeking to leverage AI for comprehensive web data analysis, offering a real-time, search-and-extract approach powered by Firecrawl and a reasoning model.

How It Works

The system uses Firecrawl for web search, scraping, and data extraction, and feeds the resulting real-time data into an AI model for analysis. The application is built on the Next.js App Router with React Server Components, which handle rendering and server-side logic. The AI SDK provides the chat interface and supports multiple LLM providers (OpenAI, Anthropic, Cohere); OpenAI's GPT-4o is the default model. A separate reasoning model, configurable via environment variables, handles complex analysis and structured data output.
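The search, extract, and analyze flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `ResearchDeps` interface, the `research` and `buildPrompt` functions, and the `maxSources` parameter are all hypothetical names, and the Firecrawl and model calls are injected as plain functions rather than made against any real client API.

```typescript
// Hypothetical types: none of these names come from the project itself.
interface SearchHit {
  url: string;
  title: string;
}

interface Finding {
  url: string;
  text: string;
}

// The three external steps are injected so the sketch does not assume
// any particular Firecrawl or AI SDK function signature.
interface ResearchDeps {
  search: (query: string) => Promise<SearchHit[]>; // e.g. backed by a web search service
  extract: (url: string) => Promise<string>; // e.g. backed by a page extraction service
  analyze: (prompt: string) => Promise<string>; // e.g. backed by the reasoning model
}

// Pure helper: turn the collected findings into a single prompt for the model.
function buildPrompt(topic: string, findings: Finding[]): string {
  const sources = findings
    .map((f, i) => `[${i + 1}] ${f.url}\n${f.text}`)
    .join("\n\n");
  return `Topic: ${topic}\n\nSources:\n${sources}`;
}

// Search for the topic, extract text from the top hits, then pass
// everything to the reasoning step in one prompt.
async function research(
  topic: string,
  deps: ResearchDeps,
  maxSources = 3, // hypothetical cap on how many hits are extracted
): Promise<string> {
  const hits = (await deps.search(topic)).slice(0, maxSources);
  const findings: Finding[] = [];
  for (const hit of hits) {
    findings.push({ url: hit.url, text: await deps.extract(hit.url) });
  }
  return deps.analyze(buildPrompt(topic, findings));
}
```

Injecting the three steps keeps the control flow testable with stub functions and leaves the choice of search, extraction, and model providers open.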

Quick Start & Requirements

  • Install dependencies: pnpm install
  • Run migrations: pnpm db:migrate
  • Start the app: pnpm dev
  • Requires Node.js and pnpm.
  • Vercel CLI recommended for deployment and environment variable management.
  • Official demo available: https://deep-research.vercel.app/

Highlighted Details

  • Leverages Firecrawl for robust web data extraction and search.
  • Supports multiple LLM providers via the AI SDK, including OpenAI, Anthropic, and Cohere.
  • Built with Next.js App Router, React Server Components, and shadcn/ui for a modern stack.
  • Integrates Vercel Postgres (Neon) for data persistence and Vercel Blob for file storage.
  • Offers configurable reasoning models for advanced analysis, with options for JSON schema validation.

Maintenance & Community

The project is maintained by nickscamara. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The project's licensing is not specified in the README. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The function execution timeout defaults to 300 seconds; Vercel Hobby tier users must lower it to 60 seconds. Using a non-OpenAI model for reasoning may require disabling JSON schema validation, in which case responses are no longer guaranteed to follow the expected structure.
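As an illustration of how these two caveats might be handled in configuration, here is a small sketch. The environment variable names (`MAX_DURATION`, `REASONING_MODEL`, `BYPASS_JSON_VALIDATION`), the `resolveSettings` function, the `hobbyTier` flag, and the fallback model are assumptions made for this example; consult the project's README for the names it actually uses.

```typescript
// Hypothetical settings shape for this sketch.
interface RuntimeSettings {
  maxDurationSeconds: number;
  reasoningModel: string;
  validateJsonSchema: boolean;
}

type Env = Record<string, string | undefined>;

const DEFAULT_MAX_DURATION = 300; // default timeout stated in the summary
const HOBBY_MAX_DURATION = 60; // Hobby tier value stated in the summary
const FALLBACK_MODEL = "gpt-4o"; // placeholder fallback, assumed for illustration

// Derive runtime settings from environment variables.
// All variable names read here are assumptions, not documented keys.
function resolveSettings(env: Env, hobbyTier: boolean): RuntimeSettings {
  const parsed = Number(env["MAX_DURATION"]);
  const requested =
    Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_DURATION;

  return {
    // On the Hobby tier, never exceed the 60 second ceiling.
    maxDurationSeconds: hobbyTier
      ? Math.min(requested, HOBBY_MAX_DURATION)
      : requested,
    reasoningModel: env["REASONING_MODEL"] ?? FALLBACK_MODEL,
    // Validation stays on unless it is explicitly bypassed.
    validateJsonSchema: env["BYPASS_JSON_VALIDATION"] !== "true",
  };
}
```

Keeping validation on by default means structured output is only relaxed when someone deliberately opts out, for example after switching to a reasoning model that does not support schema-constrained responses.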

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 513 stars in the last 90 days
