doc-chatbot  by dissorial

Document chatbot for multi-file Q&A using GPT

created 2 years ago
860 stars

Top 42.6% on sourcepulse

Project Summary

This project provides a GPT-powered chatbot for querying multiple documents across several file types and chat sessions. It is aimed at users who want to question and discuss the contents of their own documents, and it relies on LangChain for retrieval and Pinecone for embedding storage.

How It Works

The chatbot splits uploaded documents (.pdf, .docx, .txt) into chunks, converts the chunks into embeddings, and stores them in Pinecone namespaces. When a user asks a question, LangChain retrieves the document chunks that are semantically closest to the question from Pinecone and passes them to GPT, which generates the answer. Responses are therefore grounded in the provided documents rather than in the model's general knowledge alone.
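
The retrieval flow described above can be sketched in a few functions. This is a minimal in-memory illustration, not the project's code: `toyEmbed` is a hypothetical word-count stand-in for a real embedding model, the `store` array stands in for a Pinecone index, and every function name here is an assumption made for the example.

```typescript
type Vector = number[];

// One embedded chunk, tagged with the namespace it belongs to.
interface StoredChunk {
  namespace: string;
  text: string;
  embedding: Vector;
}

// Split a document into chunks of at most maxWords words.
function splitIntoChunks(text: string, maxWords: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Hypothetical stand-in for an embedding model: counts how often each
// vocabulary term occurs in the text. A real system would call an embedding API.
function toyEmbed(text: string, vocabulary: string[]): Vector {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return vocabulary.map((term) => tokens.filter((t) => t === term).length);
}

// Cosine similarity between two vectors of equal length (0 if either is all zeros).
function cosineSimilarity(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks in one namespace that are most similar to the query.
function retrieveTopK(
  store: StoredChunk[],
  namespace: string,
  queryEmbedding: Vector,
  k: number
): StoredChunk[] {
  return store
    .filter((chunk) => chunk.namespace === namespace)
    .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Assemble the text that would be sent to the language model.
function buildPrompt(question: string, context: StoredChunk[]): string {
  const contextText = context.map((c) => c.text).join("\n---\n");
  return `Answer using only the context below.\n\nContext:\n${contextText}\n\nQuestion: ${question}`;
}
```

Filtering by namespace before ranking mirrors the idea of keeping each topic's documents separate, so a question is only matched against chunks from the selected topic.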

Quick Start & Requirements

  • Install: yarn install
  • Prerequisites: Node.js, Pinecone account and API key, .env file configured with Pinecone API key, index name, and environment.
  • Setup: Clone the repository, install dependencies, configure .env, and run npm run dev.
  • Docs: https://github.com/dissorial/doc-chatbot
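
Because the app depends on three Pinecone settings in .env, it can help to fail fast when one is missing. The sketch below shows one way to do that. The variable names `PINECONE_API_KEY`, `PINECONE_ENVIRONMENT`, and `PINECONE_INDEX_NAME` are assumptions for illustration; check the repository's own example env file for the names it actually expects.

```typescript
// Hypothetical variable names; the repository may use different ones.
const REQUIRED_ENV_KEYS = [
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
] as const;

type RequiredEnvKey = (typeof REQUIRED_ENV_KEYS)[number];
type PineconeConfig = Record<RequiredEnvKey, string>;

// Validate an environment-like object and return only the required settings.
// Throws an error naming every missing or empty variable.
function loadPineconeConfig(env: Record<string, string | undefined>): PineconeConfig {
  const missing = REQUIRED_ENV_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const config = {} as PineconeConfig;
  for (const key of REQUIRED_ENV_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In a Node.js process this would typically be called once at startup with `process.env` as the argument.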

Highlighted Details

  • Supports multiple topics, chat windows, and chat history via local storage.
  • Allows creation, deletion, and management of Pinecone namespaces directly from the browser.
  • Offers an alternative branch (mongodb-and-auth) for Google authentication and MongoDB integration, though it's noted as being behind the main branch.
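
As an illustration of chat history kept in local storage, the sketch below stores one JSON array of messages per namespace and chat window. It is not taken from the project: the key format, the `ChatMessage` shape, and all function names are assumptions. The `KeyValueStorage` interface covers the subset of the browser's Web Storage API that the sketch uses, so `window.localStorage` can be passed in directly, while `InMemoryStorage` allows the same code to run outside a browser.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Subset of the Web Storage API (getItem, setItem, removeItem) used here.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Simple non-persistent implementation for tests or server-side use.
class InMemoryStorage implements KeyValueStorage {
  private items: Record<string, string> = {};
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.items, key) ? this.items[key] : null;
  }
  setItem(key: string, value: string): void {
    this.items[key] = value;
  }
  removeItem(key: string): void {
    delete this.items[key];
  }
}

// Hypothetical key format: one entry per namespace and chat window.
function historyKey(namespace: string, chatId: string): string {
  return `chat-history:${namespace}:${chatId}`;
}

// Load the stored history, falling back to an empty list on missing or corrupt data.
function loadHistory(storage: KeyValueStorage, namespace: string, chatId: string): ChatMessage[] {
  const raw = storage.getItem(historyKey(namespace, chatId));
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as ChatMessage[]) : [];
  } catch {
    return [];
  }
}

// Append one message and persist the updated history.
function appendMessage(
  storage: KeyValueStorage,
  namespace: string,
  chatId: string,
  message: ChatMessage
): ChatMessage[] {
  const updated = [...loadHistory(storage, namespace, chatId), message];
  storage.setItem(historyKey(namespace, chatId), JSON.stringify(updated));
  return updated;
}

// Remove the history of one chat window.
function clearHistory(storage: KeyValueStorage, namespace: string, chatId: string): void {
  storage.removeItem(historyKey(namespace, chatId));
}
```

Keying by both namespace and chat identifier keeps the histories of different topics and chat windows independent of each other.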

Maintenance & Community

The project is a fork of mayooear/GPT-4-LangChain with significant modifications. The frontend design is inspired by ChatGPT. The README lists no community channels and gives no information about active maintainers.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Pinecone indexes on the Starter plan are deleted after 7 days of inactivity, so periodic API requests are needed to keep an index alive. Chat history is stored mainly in local storage; authentication and database integration are available only in an older, less feature-complete branch. File conversion may fail for scanned documents or others that require OCR.
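
One way to schedule the periodic requests mentioned above is to track the time of the last request and check whether a new one is due. The sketch below covers only that timing logic. The function names and the one-day safety margin are assumptions, and the request itself is left to the caller because the source does not say which Pinecone call the project uses.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the index has been idle long enough that a keep-alive request
// should be sent. The default margin (an assumption) triggers one day early.
function isKeepAliveDue(
  lastActivityMs: number,
  nowMs: number,
  inactivityLimitDays: number = 7,
  safetyMarginDays: number = 1
): boolean {
  const thresholdMs = (inactivityLimitDays - safetyMarginDays) * MS_PER_DAY;
  return nowMs - lastActivityMs >= thresholdMs;
}

// Timestamp (in ms) at which the next keep-alive request becomes due.
function nextKeepAliveAt(
  lastActivityMs: number,
  inactivityLimitDays: number = 7,
  safetyMarginDays: number = 1
): number {
  return lastActivityMs + (inactivityLimitDays - safetyMarginDays) * MS_PER_DAY;
}
```

A scheduled job could call `isKeepAliveDue(lastActivityMs, Date.now())` once a day and, when it returns true, send any lightweight request to the index and record the new timestamp.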

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 90 days
