fully-local-pdf-chatbot by jacoblee93

Local PDF chatbot for document interaction

created 1 year ago
1,782 stars

Top 24.7% on sourcepulse

Project Summary

This project provides a fully local, client-side chat-over-documents solution, targeting users who want to query PDFs without uploading data to external servers. It leverages WebAssembly and browser-based LLM inference for privacy and offline functionality.

How It Works

The application processes uploaded PDFs entirely within the browser. It chunks the document, creates vector embeddings using Transformers.js (or optionally Ollama), and stores them in Voy, a WASM-based vector store. Retrieval-Augmented Generation (RAG) is then orchestrated with LangChain.js and LangGraph.js against a locally running LLM (WebLLM in the browser, an Ollama instance on the desktop, or Chrome's built-in Gemini Nano).
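As a concrete illustration of the ingest step, the following is a minimal sketch (not the repository's exact code) built on the LangChain.js community integrations for Voy and Transformers.js; the embedding model name and chunk sizes are illustrative assumptions.

  // Minimal in-browser ingest sketch: parse the PDF, chunk it, embed the chunks
  // with Transformers.js, and index them in the Voy WASM vector store.
  import { WebPDFLoader } from "@langchain/community/document_loaders/web/pdf";
  import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
  import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
  import { VoyVectorStore } from "@langchain/community/vectorstores/voy";
  import { Voy as VoyClient } from "voy-search";

  export async function ingestPdf(file: Blob): Promise<VoyVectorStore> {
    // Parse the uploaded PDF entirely client-side.
    const docs = await new WebPDFLoader(file).load();

    // Split into overlapping chunks small enough for embedding and retrieval.
    const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 500, chunkOverlap: 50 });
    const chunks = await splitter.splitDocuments(docs);

    // Embed locally via Transformers.js (model name is an illustrative choice).
    const embeddings = new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2" });

    // Index the chunk vectors in Voy, which runs as WebAssembly in the page.
    const store = new VoyVectorStore(new VoyClient(), embeddings);
    await store.addDocuments(chunks);
    return store;
  }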

Quick Start & Requirements

  • In-browser (WebLLM): Upload PDF directly; model weights (Phi-3.5) download on first use (several GB).
  • Ollama: Requires the Ollama desktop app serving a Mistral model (a connection sketch follows this list).
    • OLLAMA_ORIGINS=https://webml-demo.vercel.app OLLAMA_HOST=127.0.0.1:11435 ollama serve
    • OLLAMA_HOST=127.0.0.1:11435 ollama pull mistral
  • Gemini Nano: Requires enrollment in Chrome's built-in AI early preview program.
  • Dependencies: Node.js and Yarn (the app is a Next.js project run locally).
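For the Ollama path, a rough sketch (an assumption about the wiring, not the repo's code) is to point LangChain.js's ChatOllama client at the non-default host configured above, using the model pulled in the second command:

  import { ChatOllama } from "@langchain/ollama";

  // Talks to the local Ollama server started with OLLAMA_HOST=127.0.0.1:11435.
  const model = new ChatOllama({
    baseUrl: "http://127.0.0.1:11435",
    model: "mistral",
    temperature: 0,
  });

  const reply = await model.invoke("Reply with 'ready' if you can hear me.");
  console.log(reply.content);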

Highlighted Details

  • Fully client-side RAG pipeline.
  • Supports multiple local LLM backends: Ollama, WebLLM, Gemini Nano.
  • Utilizes Voy (WASM vector store) and Transformers.js for embeddings.
  • Orchestration via LangChain.js and LangGraph.js (see the query sketch after this list).
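At query time, the pieces above combine into a retrieve-then-generate step. The sketch below assumes the store and model objects from the earlier snippets and a hypothetical answer helper; the actual app orchestrates this flow with LangGraph.js.

  import type { VoyVectorStore } from "@langchain/community/vectorstores/voy";
  import type { ChatOllama } from "@langchain/ollama";

  // Retrieve the most relevant chunks from Voy, then ask the local model to
  // answer strictly from that context (hypothetical helper, for illustration).
  export async function answer(question: string, store: VoyVectorStore, model: ChatOllama) {
    const docs = await store.asRetriever(4).invoke(question);
    const context = docs.map((d) => d.pageContent).join("\n\n");

    const response = await model.invoke(
      `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
    );
    return response.content;
  }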

Maintenance & Community

The project credits the teams behind Voy, Ollama, WebLLM, and Transformers.js. The author is active on Twitter/X as @Hacubu.

Licensing & Compatibility

The repository does not explicitly state a license in the README, so suitability for commercial use or closed-source integration is unspecified.

Limitations & Caveats

The Gemini Nano integration is experimental and may yield variable results as the model is not chat-tuned. The project is a Next.js app, and deployment details beyond local setup are not provided.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

34 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Simon Willison (co-creator of Django), and 1 more.

Lumos by andrewnguonly

Chrome extension for local LLM web RAG co-piloting
Top 0.1% on sourcepulse · 2k stars · created 1 year ago · updated 6 months ago