kotaemon by Cinnamon

Open-source RAG UI for chatting with documents, targeting both end-users and developers

Created 1 year ago
24,273 stars

Top 1.6% on SourcePulse

Project Summary

Kotaemon provides an open-source, customizable RAG UI for document-based question answering, targeting both end-users seeking a chat interface for their documents and developers building RAG pipelines. It offers a clean UI, supports various LLMs (API-based and local via Ollama/llama-cpp-python), and facilitates RAG pipeline development with features like hybrid retrieval, multi-modal QA, and advanced citations.

How It Works

Kotaemon implements a RAG pipeline with a hybrid retriever combining full-text and vector search, augmented by re-ranking for improved retrieval quality. It supports complex reasoning methods like question decomposition and agent-based reasoning (ReAct, ReWOO). The architecture is built on Gradio, allowing for a customizable UI and extensible RAG pipeline strategies, including GraphRAG indexing.
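The hybrid retrieval idea can be illustrated with a minimal sketch. It is not Kotaemon's actual code: it assumes reciprocal rank fusion (RRF) as the merge strategy, which the summary does not confirm, and every name below (`keyword_rank`, `vector_rank`, `reciprocal_rank_fusion`, the toy `DOCS` and `DOC_VECS` data) is hypothetical. A real pipeline would then pass the top fused candidates to a re-ranking model, which is omitted here.

```python
import math
from collections import defaultdict

# Toy corpus and hand-made 2-d "embeddings" (illustrative assumptions only).
DOCS = {
    "a": "hybrid search combines keyword and vector search",
    "b": "gradio builds web interfaces",
    "c": "vector embeddings capture meaning",
}
DOC_VECS = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}


def keyword_rank(query, docs):
    """Toy full-text ranking: order docs by how often query terms occur."""
    terms = query.lower().split()
    scores = {
        doc_id: sum(text.lower().split().count(t) for t in terms)
        for doc_id, text in docs.items()
    }
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [d for d in ranked if scores[d] > 0]  # drop non-matching docs


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def vector_rank(query_vec, doc_vecs):
    """Order docs by cosine similarity to the query vector."""
    sims = {doc_id: cosine(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(sims, key=lambda d: (-sims[d], d))


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each doc scores sum(1 / (k + rank)) over the lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


fused = reciprocal_rank_fusion(
    [keyword_rank("vector search", DOCS), vector_rank([1.0, 0.0], DOC_VECS)]
)
print(fused)
```

Rank-based fusion avoids having to normalize keyword scores and cosine similarities onto a common scale, which is one reason it is a common choice for hybrid search.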

Quick Start & Requirements

  • Install: Via Docker (recommended) or from source (pip install -e "libs/kotaemon[all]", pip install -e "libs/ktem").
  • Prerequisites: Python >= 3.10. Docker is optional but recommended. The unstructured library is needed to process file types other than .pdf, .html, .mhtml, and .xlsx.
  • Resources: Live demos are available on Hugging Face Spaces. The project also provides user and developer guides.
  • Docker Images: ghcr.io/cinnamon/kotaemon:main-full, ghcr.io/cinnamon/kotaemon:main-ollama, ghcr.io/cinnamon/kotaemon:main-lite. Supports linux/amd64 and linux/arm64.
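The requirements above can be captured in a small helper. This is only a sketch built from the details listed in this summary: the function and constant names are hypothetical and are not part of Kotaemon, and the variant keys ("full", "ollama", "lite") are shorthand invented here for the three image tags.

```python
from pathlib import Path

# File types the summary lists as handled without the `unstructured` library.
BUILTIN_EXTENSIONS = {".pdf", ".html", ".mhtml", ".xlsx"}

# Image tags named in the summary, keyed by a hypothetical variant label.
IMAGE_TAGS = {"full": "main-full", "ollama": "main-ollama", "lite": "main-lite"}


def needs_unstructured(filename):
    """Return True if the file type falls outside the built-in list."""
    return Path(filename).suffix.lower() not in BUILTIN_EXTENSIONS


def image_reference(variant):
    """Build the full image reference for a variant label."""
    return f"ghcr.io/cinnamon/kotaemon:{IMAGE_TAGS[variant]}"


print(needs_unstructured("notes.docx"))
print(image_reference("lite"))
```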

Highlighted Details

  • Supports multi-user login, private/public collections, and shared chats.
  • Offers advanced citations with in-browser PDF preview and highlights.
  • Integrates with external RAG frameworks like NanoGraphRAG and LightRAG.
  • Features a configurable settings UI for retrieval and generation parameters.

Maintenance & Community

The project is actively developed by "The Kotaemon Team." Feedback and contributions are welcomed via GitHub issues and a contributing guide.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Installing optional dependencies such as unstructured, or specific RAG integrations (e.g., nano-graphrag, LightRAG), may introduce Python package version conflicts that must be resolved manually. Official MS GraphRAG indexing works only with the OpenAI or Ollama APIs.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 4
  • Issues (30d): 5
  • Star history: 1,398 stars in the last 30 days
