Domain-specific chat completions app
This project provides a starter application for building domain-specific chatbots using OpenAI's GPT models and pgvector for efficient similarity search. It targets developers looking to augment LLMs with custom knowledge bases, enabling more accurate and verifiable responses by grounding them in provided documents.
How It Works
The application scrapes web pages, extracts plain text, and segments it into manageable chunks. Each chunk is then converted into a vector embedding using OpenAI's text-embedding-ada-002 model. These embeddings, along with the original text and source URL, are stored in a Supabase PostgreSQL database with the pgvector extension. When a user queries the system, their prompt is also embedded, and a similarity search is performed against the vector database to retrieve the most relevant document chunks. These chunks are then incorporated into a prompt sent to the OpenAI Chat Completions API, ensuring the LLM's response is informed by the domain-specific data.
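The retrieval flow described above can be sketched in TypeScript roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the repository's code: the names chunkText, cosineSimilarity, topK, buildMessages, and the StoredChunk shape are hypothetical, the word-based chunking rule is arbitrary, the 3-dimensional vectors are hand-written stand-ins for real text-embedding-ada-002 embeddings, and the in-memory ranking stands in for the pgvector similarity query.

```typescript
// Hypothetical shapes and helper names; none of these come from the repository.
interface StoredChunk {
  content: string;     // plain text of the chunk
  url: string;         // source page the chunk was scraped from
  embedding: number[]; // embedding vector (toy 3-d vectors in the usage below)
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Split plain text into chunks of at most `maxWords` words (an assumed chunking rule).
function chunkText(text: string, maxWords: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory stand-in for the database similarity search:
// rank stored chunks by similarity to the query vector and keep the top k.
function topK(query: number[], rows: StoredChunk[], k: number): StoredChunk[] {
  return [...rows]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}

// Build chat messages that place the retrieved chunks in front of the user's question.
function buildMessages(question: string, context: StoredChunk[]): ChatMessage[] {
  const sources = context
    .map((c) => `Source: ${c.url}\n${c.content}`)
    .join("\n\n");
  return [
    { role: "system", content: `Answer using only the context below.\n\n${sources}` },
    { role: "user", content: question },
  ];
}

// Usage with toy data; real embeddings would come from the embedding model.
const rows: StoredChunk[] = [
  {
    content: "pgvector adds a vector column type to Postgres.",
    url: "https://example.com/pgvector",
    embedding: [1, 0, 0],
  },
  {
    content: "Supabase hosts Postgres databases.",
    url: "https://example.com/supabase",
    embedding: [0, 1, 0],
  },
];
const best = topK([0.9, 0.1, 0], rows, 1);
const messages = buildMessages("What does pgvector add?", best);
console.log(messages);
```

Keeping the source URL next to each chunk in the prompt is what lets the model's answer be traced back to a document. In a deployed setup the ranking step would presumably run inside the database rather than in application code.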
Quick Start & Requirements
Install dependencies: npm install
Run: npm run dev
Requires a Supabase project with pgvector support.
Requires a .env.local file with Supabase URL, Supabase Anon Key, and OpenAI API Key.
Highlighted Details
Uses the pgvector extension for vector storage and search.
Maintenance & Community
The repository is maintained by gannonh. No specific community channels or roadmap links are provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license in the README. This requires further investigation for commercial use or integration into closed-source projects.
Limitations & Caveats
The project is presented as a starter app, implying it may require further development before it is production-ready. Because it depends on the Supabase and OpenAI APIs, running it incurs usage costs from both services. Its effectiveness depends heavily on the quality of the scraped data and on the chosen embedding model.