gurubase  by Gurubase

Open-source RAG system for AI-powered Q&A assistants

created 8 months ago
718 stars

Top 48.9% on sourcepulse

Project Summary

Gurubase provides an open-source Retrieval-Augmented Generation (RAG) system for creating AI-powered Q&A assistants, or "Gurus," from various data sources. It targets developers and organizations that want to add an "Ask AI" feature to their documentation, support, or knowledge bases, offering instant, referenced answers and reducing hallucinations.

How It Works

Gurubase employs a RAG architecture involving indexing, embedding, and retrieval. Data sources (web pages, PDFs, YouTube, GitHub repos, Jira, Zendesk) are processed and chunked, then converted into vector embeddings stored in Milvus. When a question is asked, relevant context is retrieved from Milvus, and an LLM generates an answer, with an evaluation mechanism to minimize hallucinations. This approach allows for accurate, context-aware responses grounded in the provided data.
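The index, embed, and retrieve flow described above can be sketched in a few lines. This is a minimal illustration, not Gurubase's code: the bag-of-words `embed` function stands in for a real embedding model, the `InMemoryIndex` class stands in for Milvus, and every name here (`chunk_text`, `InMemoryIndex`, `build_prompt`, the sample file paths) is hypothetical. The final LLM call and the hallucination evaluation step are left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: term counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector store such as Milvus."""

    def __init__(self):
        self.rows = []  # (source, chunk, vector)

    def add(self, source, text):
        for chunk in chunk_text(text):
            self.rows.append((source, chunk, embed(chunk)))

    def search(self, question, top_k=2):
        q = embed(question)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[2]), reverse=True)
        return [(source, chunk) for source, chunk, _ in ranked[:top_k]]


def build_prompt(question, hits):
    """Assemble the retrieved chunks into a prompt that asks for cited answers."""
    context = "\n".join(f"[{i + 1}] ({src}) {chunk}" for i, (src, chunk) in enumerate(hits))
    return f"Answer using only the context below and cite sources.\n{context}\n\nQuestion: {question}"


index = InMemoryIndex()
index.add("docs/install.md", "Run the install script to start the self-hosted services.")
index.add("docs/widget.md", "Embed the website widget to add an Ask AI button to your documentation.")

question = "How do I add the Ask AI widget?"
hits = index.search(question, top_k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the source label next to each chunk is what makes referenced answers possible: the prompt passes both the text and where it came from to the model.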

Quick Start & Requirements

  • Install via: curl -fsSL https://raw.githubusercontent.com/Gurubase/gurubase/refs/heads/master/gurubase.sh -o gurubase.sh && bash gurubase.sh
  • Detailed instructions are in INSTALL.md.
  • Minimum requirements: 4 CPU cores, 8GB RAM, 10GB SSD storage, Linux or macOS (WSL2 for Windows).
  • See Gurubase Documentation for more.
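
Before running the installer, it may help to compare the host against the minimums listed above. The following is a rough pre-flight sketch, not part of Gurubase: the function names are hypothetical, "10GB SSD storage" is interpreted here as free space on the current volume, and the RAM probe relies on `os.sysconf`, which is available on Linux and macOS but not on native Windows.

```python
import os
import shutil

# Minimums quoted in the requirements list above.
MIN_CORES, MIN_RAM_GB, MIN_DISK_GB = 4, 8, 10


def unmet_requirements(cores, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means all minimums are met."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"CPU cores: {cores} < {MIN_CORES}")
    if ram_gb is None:
        problems.append("RAM: could not be determined on this platform")
    elif ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB < {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Free disk: {free_disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    return problems


def detect_host(path="."):
    """Probe the current machine; RAM is None where os.sysconf is unavailable."""
    cores = os.cpu_count() or 0
    try:
        ram_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1024**3
    except (AttributeError, ValueError, OSError):
        ram_gb = None
    free_disk_gb = shutil.disk_usage(path).free / 1024**3
    return cores, ram_gb, free_disk_gb


if __name__ == "__main__":
    issues = unmet_requirements(*detect_host())
    print("OK" if not issues else "\n".join(issues))
```

A machine sold as "8GB" often reports slightly less usable memory, so a RAM warning from this check is a prompt to look closer rather than a hard failure.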

Highlighted Details

  • Supports web pages, PDFs, YouTube videos, GitHub repositories, Jira issues, and Zendesk tickets/articles as data sources.
  • Offers integrations via a website widget, Slack bot, Discord bot, and GitHub bot.
  • Features "Binge" for visualizing and navigating learning paths.
  • Provides a self-hosted option for full deployment control.
  • Microservices architecture: Next.js frontend, Django backend, Milvus vector store, RabbitMQ, Redis, PostgreSQL.

Maintenance & Community

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Generated content follows the license of the data sources used.

Limitations & Caveats

Currently, only the Gurubase team can create new Gurus on Gurubase.io; users must submit requests via GitHub issues. While self-hosting is supported, the initial quick install script is a basic entry point, with detailed setup and upgrades managed via INSTALL.md. Periodic reindexing for all data sources is planned but not yet implemented.

Health Check

  • Last commit: 4 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 51 stars in the last 90 days
