langchain-supabase-website-chatbot  by mayooear

Chatbot for website Q&A using LLMs and vector DB

created 2 years ago
697 stars

Top 49.8% on sourcepulse

Project Summary

This project provides a template for building a website-specific chatbot using LangChain, Supabase, and Next.js. It targets developers looking to integrate AI-powered Q&A capabilities into their web applications, leveraging vector embeddings for efficient information retrieval.

How It Works

The system scrapes specified website URLs using Cheerio, extracts relevant text content and metadata, and converts this data into vector embeddings via OpenAI's text-embedding-ada-002 model. These embeddings are stored in a Supabase PostgreSQL database with the pgvector extension. When a user asks a question, the system retrieves similar document vectors from Supabase and uses LangChain to generate a relevant answer based on the scraped content.
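The retrieval step can be illustrated with a minimal in-memory sketch. It assumes cosine similarity as the ranking metric and uses a hypothetical StoredChunk shape with tiny 3-dimensional vectors for readability; in the project itself the comparison runs inside Supabase via pgvector, and text-embedding-ada-002 vectors have 1536 dimensions.

```typescript
// Hypothetical shape of a stored chunk; the field names are assumptions
// made for this sketch, not the project's actual schema.
interface StoredChunk {
  content: string;
  url: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
function topKSimilar(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

The chunks returned by a function like topKSimilar would then be passed to the language model as context for answering the user's question.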

Quick Start & Requirements

  • Install dependencies: pnpm install
  • Set up .env file with OPENAI_API_KEY, NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_ANON_KEY, and SUPABASE_SERVICE_ROLE_KEY.
  • Configure website URLs and CSS selectors for scraping in config/ and utils/custom_web_loader.ts.
  • Run scraping and embedding: npm run scrape-embed
  • Run the application: npm run dev
  • Requires Node.js, Supabase account, and OpenAI API key.
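Before running the scrape or the app, it can help to confirm that the four variables listed above are actually set. The sketch below is a hypothetical helper (findMissingEnvKeys is not part of the project) that takes the environment as a plain object, so it can be tried without a real .env file.

```typescript
// The four variables named in the setup steps above.
const REQUIRED_ENV_KEYS = [
  "OPENAI_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

// Hypothetical helper: returns the required keys that are unset or blank.
function findMissingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js script this could be called as findMissingEnvKeys(process.env), exiting with a clear message if the returned array is not empty.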

Highlighted Details

  • Utilizes Supabase for scalable vector storage with pgvector.
  • Custom web scraping logic allows for flexible content extraction via CSS selectors.
  • Integrates OpenAI embeddings for semantic search capabilities.
  • Built with Next.js for a modern React frontend.

Maintenance & Community

  • Project is maintained by mayooear.
  • Frontend inspired by langchain-chat-nextjs.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

The custom web loader requires CSS selectors to be configured by hand for each website, so scraping can break whenever a site's structure changes. The project is presented as a template and will likely need further development before production use.
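One way to keep per-site selectors manageable is a lookup table keyed by hostname with a generic fallback. This is only a sketch under stated assumptions: the hostnames, selectors, and the selectorFor helper are hypothetical and are not taken from the project's config/ directory or utils/custom_web_loader.ts.

```typescript
// Hypothetical per-site selector table; hostnames and selectors are illustrative only.
const SELECTORS_BY_HOST: Record<string, string> = {
  "docs.example.com": "main article",
  "blog.example.com": ".post-content",
};

// Used when a page's host has no dedicated entry.
const FALLBACK_SELECTOR = "body";

// Pick the CSS selector to use for a given page URL.
function selectorFor(pageUrl: string): string {
  const host = new URL(pageUrl).hostname;
  return SELECTORS_BY_HOST[host] ?? FALLBACK_SELECTOR;
}
```

A fallback like this keeps scraping from failing outright on an unlisted site, at the cost of pulling in navigation and footer text that a site-specific selector would exclude.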

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 11 stars in the last 90 days
