firestarter by firecrawl

AI chatbots for any website

Created 3 months ago
485 stars

Top 63.4% on SourcePulse

Project Summary

Firestarter enables users to instantly create AI chatbots for any website, powered by Retrieval-Augmented Generation (RAG). It targets developers and power users looking to transform website content into interactive, queryable knowledge bases with a streaming chat interface and an OpenAI-compatible API.

How It Works

Firestarter works in two phases. First, it uses Firecrawl to crawl the target site and convert its pages into clean Markdown. That content is indexed in Upstash Search, a vector database that handles chunking and vector embedding automatically and stores each site's data under its own namespace for isolation. Second, each user query runs through a RAG pipeline: a semantic search in Upstash retrieves relevant context, which is combined with the query into a prompt for an LLM (OpenAI, Groq, or Anthropic). Responses are streamed back via the Vercel AI SDK.
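
The two phases can be illustrated with a small in-memory sketch. This is not Firestarter's code: the `indexSite`, `retrieve`, and `buildPrompt` names are hypothetical, the index is a plain object rather than Upstash Search, and retrieval uses naive keyword overlap as a stand-in for semantic vector search. It only shows how namespaced indexing, retrieval, and prompt assembly fit together.

```typescript
// Toy stand-in for the two phases: index pages under a namespace, then
// retrieve context and build a prompt. Not Firestarter's actual code.

interface Doc {
  url: string;
  markdown: string;
}

// Namespace -> documents; mimics per-site isolation in the index.
const siteIndex: Record<string, Doc[]> = {};

function indexSite(namespace: string, docs: Doc[]): void {
  siteIndex[namespace] = docs;
}

// Lowercased alphanumeric words used for the naive overlap score.
function tokens(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Keyword-overlap retrieval: a placeholder for semantic vector search.
function retrieve(namespace: string, query: string, topK = 2): Doc[] {
  const queryTokens = tokens(query);
  const docs = siteIndex[namespace] ?? [];
  return docs
    .map((doc) => {
      const docTokens = tokens(doc.markdown);
      const score = queryTokens.filter((t) => docTokens.indexOf(t) >= 0).length;
      return { doc, score };
    })
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((scored) => scored.doc);
}

// Combine the retrieved context and the user query into one prompt string.
function buildPrompt(query: string, context: Doc[]): string {
  const sources = context
    .map((d, i) => `[${i + 1}] ${d.url}\n${d.markdown}`)
    .join("\n\n");
  return `Answer using only the context below.\n\nContext:\n${sources}\n\nQuestion: ${query}`;
}
```

In a real deployment the string returned by `buildPrompt` would be sent to the configured LLM and the answer streamed to the client; both steps are omitted here.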

Quick Start & Requirements

  • Install: Clone the repository, create a .env.local file with API keys for Firecrawl, Upstash, and an LLM provider (OpenAI default), then run npm install or yarn install.
  • Run: Execute npm run dev or yarn dev.
  • Prerequisites: API keys for Firecrawl, Upstash, and at least one LLM provider (OpenAI, Groq, Anthropic).
  • Local URL: http://localhost:3000 (available once the dev server is running)
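
Because a missing key is the most likely setup failure, a small preflight check can help. The sketch below is an illustration only: the variable names (`FIRECRAWL_API_KEY`, `UPSTASH_SEARCH_REST_URL`, `UPSTASH_SEARCH_REST_TOKEN`, and the three LLM keys) are assumptions and should be checked against the repository's README, and `missingEnv` is a hypothetical helper.

```typescript
type EnvVars = Record<string, string | undefined>;

// Assumed variable names; verify against the repository before relying on them.
const REQUIRED_KEYS = [
  "FIRECRAWL_API_KEY",
  "UPSTASH_SEARCH_REST_URL",
  "UPSTASH_SEARCH_REST_TOKEN",
];

// At least one of these must be set (one LLM provider is enough).
const LLM_KEYS = ["OPENAI_API_KEY", "GROQ_API_KEY", "ANTHROPIC_API_KEY"];

// Return a list of human-readable problems; an empty list means the env looks complete.
function missingEnv(env: EnvVars): string[] {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  const hasLlmKey = LLM_KEYS.some((key) => Boolean(env[key]));
  if (!hasLlmKey) {
    missing.push("one of " + LLM_KEYS.join(" | "));
  }
  return missing;
}
```

In a Node.js script, this could be called as `missingEnv(process.env)` before starting the dev server.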

Highlighted Details

  • Creates a fully functional chat interface and an OpenAI-compatible API endpoint for each website.
  • Supports flexible LLM provider integration (Groq, OpenAI, Anthropic) with configurable priority.
  • Allows customization of crawl depth and chatbot creation features.
  • Indexes are persistent, and chatbots can be accessed anytime.
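
Provider priority, as described in the list above, could work roughly as follows. This is a guess at the behavior, not the project's implementation: `pickProvider` is a hypothetical function, the environment variable names are assumptions, and the priority order is supplied by the caller because the source does not state the default order.

```typescript
type Provider = "groq" | "openai" | "anthropic";
type ProviderEnv = Record<string, string | undefined>;

// Assumed env var name per provider (hypothetical; verify against the repo).
const PROVIDER_KEYS: Record<Provider, string> = {
  groq: "GROQ_API_KEY",
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

// Return the first provider in priority order whose API key is set, or null.
function pickProvider(env: ProviderEnv, priority: Provider[]): Provider | null {
  for (const provider of priority) {
    if (env[PROVIDER_KEYS[provider]]) {
      return provider;
    }
  }
  return null;
}
```

Changing the order of the `priority` array is all it takes to prefer a different provider when several keys are configured.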

Maintenance & Community

The project is open source and welcomes contributions via pull requests. Support requests can be filed as issues in the repository.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The default crawl limit is 10 pages, though self-hosted deployments can raise it in the configuration. Core functionality depends on external services (Firecrawl, Upstash, and an LLM provider), each of which requires its own API key.
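
A self-hosted deployment might resolve the crawl limit along these lines. The sketch assumes the override arrives as a string, for example from an environment variable or config file; `resolveCrawlLimit` and that mechanism are hypothetical, and only the default of 10 comes from the text above.

```typescript
// Default stated in the section above.
const DEFAULT_CRAWL_LIMIT = 10;

// Parse a hypothetical string override; fall back to the default when it is
// missing, non-numeric, or less than one page.
function resolveCrawlLimit(override?: string): number {
  const parsed = Number(override);
  if (!override || !isFinite(parsed) || parsed < 1) {
    return DEFAULT_CRAWL_LIMIT;
  }
  return Math.floor(parsed);
}
```

Falling back to the default on bad input keeps a typo in the configuration from disabling crawling altogether.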

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 30 stars in the last 30 days
