firestarter by firecrawl

AI chatbots for any website

Created 7 months ago

539 stars

Top 58.9% on SourcePulse


2 Experts Love This Project

Shawn Wang

Editor of Latent Space

Nicolas Camara

Cofounder of Firecrawl

Project Summary

Firestarter enables users to instantly create AI chatbots for any website, powered by Retrieval-Augmented Generation (RAG). It targets developers and power users looking to transform website content into interactive, queryable knowledge bases with a streaming chat interface and an OpenAI-compatible API.

How It Works

Firestarter works in two phases. First, it uses Firecrawl to scrape a website and aggregate its content into clean Markdown. That content is indexed into Upstash Search, a vector database that handles chunking and vector embedding automatically and stores each site's data under a unique namespace for isolation. Second, each user query triggers a RAG pipeline: semantic search in Upstash retrieves relevant context, which is combined with the query into a prompt for an LLM from OpenAI, Groq, or Anthropic. Responses are streamed back via the Vercel AI SDK.
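
The two phases can be illustrated with a minimal sketch. It does not use the Firecrawl, Upstash, or Vercel AI SDK clients: InMemoryIndex, tokenize, and buildPrompt are hypothetical names, blank-line splitting stands in for automatic chunking, and word overlap stands in for semantic search. Only the overall shape (namespaced indexing, retrieval, prompt assembly) follows the description above.

```typescript
// Hypothetical in-memory stand-in for the search index (not the Upstash client).
type Chunk = { namespace: string; url: string; text: string };

// Lowercase a string and split it into alphanumeric words.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

class InMemoryIndex {
  private chunks: Chunk[] = [];

  // Phase 1: store scraped Markdown under a per-site namespace.
  // Splitting on blank lines is a crude substitute for automatic chunking.
  add(namespace: string, url: string, markdown: string): void {
    for (const part of markdown.split(/\n\s*\n/)) {
      const text = part.trim();
      if (text) this.chunks.push({ namespace, url, text });
    }
  }

  // Phase 2a: return the top-k chunks from a single namespace.
  // Counting shared words stands in for embedding-based semantic search.
  search(namespace: string, query: string, k = 3): Chunk[] {
    const terms = new Set(tokenize(query));
    return this.chunks
      .filter((c) => c.namespace === namespace)
      .map((c) => ({
        chunk: c,
        score: tokenize(c.text).filter((w) => terms.has(w)).length,
      }))
      .filter((s) => s.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((s) => s.chunk);
  }
}

// Phase 2b: combine the retrieved context and the query into one prompt.
function buildPrompt(query: string, context: Chunk[]): string {
  const sources = context
    .map((c, i) => `[${i + 1}] ${c.url}\n${c.text}`)
    .join("\n\n");
  return `Answer using only the context below.\n\nContext:\n${sources}\n\nQuestion: ${query}`;
}
```

In a real deployment the prompt would be sent to the configured LLM and the answer streamed back; here the point is only that a query against one namespace never sees another site's chunks.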

Quick Start & Requirements

Install: Clone the repository, create a .env.local file with API keys for Firecrawl, Upstash, and an LLM provider (OpenAI default), then run npm install or yarn install.
Run: Execute npm run dev or yarn dev.
Prerequisites: API keys for Firecrawl, Upstash, and at least one LLM provider (OpenAI, Groq, Anthropic).
Local URL: once the dev server is running, open http://localhost:3000
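
A quick way to catch a misconfigured .env.local before starting the dev server is a small startup check. The sketch below is an assumption-heavy illustration: the environment variable names, checkEnv, and pickProvider are not taken from the text above, so confirm the actual names against the repository's README before relying on them.

```typescript
type Env = Record<string, string | undefined>;

// Assumed variable names; the listing above does not state them.
const REQUIRED_KEYS: string[] = [
  "FIRECRAWL_API_KEY",
  "UPSTASH_SEARCH_REST_URL",
  "UPSTASH_SEARCH_REST_TOKEN",
];

// Assumed mapping from provider name to its API key variable.
const PROVIDER_KEYS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  groq: "GROQ_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

// Return a list of human-readable problems; an empty list means the env looks usable.
function checkEnv(env: Env): string[] {
  const problems = REQUIRED_KEYS.filter((k) => !env[k]).map((k) => `missing ${k}`);
  const hasProvider = Object.values(PROVIDER_KEYS).some((k) => Boolean(env[k]));
  if (!hasProvider) problems.push("no LLM provider key set");
  return problems;
}

// Pick the first provider in the given priority order that has a key configured.
function pickProvider(env: Env, priority: string[]): string | undefined {
  return priority.find((name) => {
    const key = PROVIDER_KEYS[name];
    return key !== undefined && Boolean(env[key]);
  });
}
```

Calling checkEnv(process.env) at startup and printing any returned problems would surface a missing key immediately rather than at the first crawl or chat request.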

Highlighted Details

Creates a fully functional chat interface and an OpenAI-compatible API endpoint for each website.
Supports flexible LLM provider integration (Groq, OpenAI, Anthropic) with configurable priority.
Allows customization of crawl depth and chatbot creation features.
Indexes are persistent, and chatbots can be accessed anytime.
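
Because each chatbot exposes an OpenAI-compatible endpoint, a client can talk to it with a standard chat-completions request body (model, messages, stream). The sketch below only builds the request; the endpoint path, the use of the model field to select a chatbot, and the bearer token are assumptions for illustration, not details confirmed by the text above.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical URL; check the running app for the real endpoint path.
const ASSUMED_ENDPOINT = "http://localhost:3000/api/v1/chat/completions";

// Build a fetch-ready request in the OpenAI chat-completions shape.
// Treating `model` as the chatbot identifier is an assumption.
function buildChatRequest(model: string, messages: ChatMessage[], apiKey: string) {
  return {
    url: ASSUMED_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Usage (needs a running server, so it is left commented out):
// const { url, init } = buildChatRequest(
//   "my-site-chatbot",
//   [{ role: "user", content: "What does this site offer?" }],
//   "placeholder-key",
// );
// const res = await fetch(url, init);
// const data = await res.json();
// console.log(data.choices[0].message.content);
```

Keeping the request in the standard shape means existing OpenAI client libraries can usually be pointed at the chatbot by changing only the base URL.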

Maintenance & Community

The project welcomes contributions via pull requests. For support, open an issue in the repository.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The default crawl limit is 10 pages, though this can be increased in the configuration for self-hosted versions. The system relies on external API keys for core functionality.

Health Check

Last Commit

6 months ago

Responsiveness

Inactive

Star History

11 stars in the last 30 days