chat-your-data  by hwchase17

Chatbot app for question answering over custom documents

created 2 years ago
959 stars

Top 39.2% on sourcepulse

Project Summary

This project lets users build a ChatGPT-like interface over their own documents using LangChain. It targets individuals and developers who want to use large language models to query private or domain-specific datasets. Because embeddings are generated through OpenAI's API, document text is sent to that service during ingestion.

How It Works

The system ingests user-provided documents, generates embeddings for them with OpenAI's models, and indexes those embeddings with FAISS for efficient similarity search. The resulting vector store is saved to vectorstore.pkl. When a query is made, relevant passages are retrieved from the vector store and inserted into custom prompts, grounding the LLM's response in the ingested data.
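
The ingest-then-retrieve flow described above can be sketched in a few lines. This is a minimal illustration and not the project's code: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAI's embedding models, the brute-force cosine ranking stands in for FAISS, and `ingest` and `retrieve` are invented names. Only the file name vectorstore.pkl comes from the project.

```python
import math
import os
import pickle
import tempfile
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an OpenAI embedding call: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(docs, path):
    """Embed each document and pickle the (text, vector) pairs to disk."""
    store = [(doc, toy_embed(doc)) for doc in docs]
    with open(path, "wb") as f:
        pickle.dump(store, f)


def retrieve(query, path, k=1):
    """Load the pickled store and return the k texts most similar to the query."""
    with open(path, "rb") as f:
        store = pickle.load(f)
    q = toy_embed(query)
    # Brute-force ranking; a real index such as FAISS avoids scanning every vector.
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


docs = [
    "FAISS performs fast similarity search over vectors",
    "The quarterly report covers revenue and costs",
]
# Written to a temporary directory here so the sketch leaves no files behind.
store_path = os.path.join(tempfile.mkdtemp(), "vectorstore.pkl")
ingest(docs, store_path)
top = retrieve("how does similarity search work", store_path)
print(top)
```

The split into an offline ingestion step and an online retrieval step mirrors the two commands in the Quick Start: documents are embedded once, and each query only pays for one embedding plus a lookup.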

Quick Start & Requirements

  • Install requirements: pip install -r requirements.txt
  • Set OpenAI API Key: export OPENAI_API_KEY=<your_key_here>
  • Ingest data: python ingest_data.py
  • Run application: python app.py
  • Prerequisites: Python, OpenAI API key, FAISS.
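
Before running the application, it can help to confirm that the earlier steps were completed. The following is a small sketch of such a check, not part of the project: `check_setup` is a hypothetical helper, and it assumes that ingestion writes vectorstore.pkl to the current working directory.

```python
import os


def check_setup(env, vectorstore_path="vectorstore.pkl"):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    # The API key is read from the environment, as in the export step above.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Assumption: the ingestion step leaves its output at this path.
    if not os.path.exists(vectorstore_path):
        problems.append(f"{vectorstore_path} not found; run ingest_data.py first")
    return problems


for problem in check_setup(os.environ):
    print("setup issue:", problem)
```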

Highlighted Details

  • Leverages LangChain for LLM orchestration.
  • Utilizes OpenAI Embeddings and FAISS for efficient data indexing and retrieval.
  • Supports custom prompts for grounded responses.
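
To illustrate what a custom prompt for grounded responses might look like, here is a minimal sketch. The template wording and the `build_prompt` helper are hypothetical and are not taken from the project; the point is only that retrieved passages are placed in the prompt and the model is told to stay within them.

```python
# Hypothetical template: the project's actual prompt wording may differ.
PROMPT_TEMPLATE = """Answer the question using only the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question, chunks):
    """Join retrieved chunks into one context block and fill the template."""
    context = "\n---\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_prompt(
    "What does FAISS do?",
    ["FAISS performs similarity search over vectors."],
)
print(prompt)
```

Telling the model to admit when the context lacks an answer is one common way to reduce unsupported responses; how well it works depends on the model and the quality of the retrieved passages.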

Maintenance & Community

No specific information on contributors, sponsorships, or community channels is provided in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The project relies on OpenAI's API, incurring costs and requiring an API key. The README does not mention support for alternative embedding models or vector stores, nor does it detail performance benchmarks or scalability.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 13 stars in the last 90 days
