NexusRAG by LeDat98

Hybrid RAG system for intelligent document analysis and Q&A

Created 3 weeks ago


257 stars

Top 98.2% on SourcePulse

Project Summary

NexusRAG is a hybrid Retrieval-Augmented Generation (RAG) system that delivers accurate, cited answers from documents. It targets engineers, researchers, and power users who need advanced document understanding, combining vector search, knowledge graphs, and visual intelligence for Q&A, agentic chat, and precise citations.

How It Works

NexusRAG implements a multi-stage pipeline that extends traditional RAG. Documents are first parsed with Docling (the default; preserves structure, math, and layout; ~18-20GB VRAM) or Marker (lighter, stronger on math; ~2-4GB VRAM) [https://github.com/docling-project/docling, https://github.com/datalab-to/marker]. The parsers extract images and tables, which are captioned by an LLM and embedded so they become searchable. The system uses dual embedding models: BAAI/bge-m3 for fast vector search, and a separate model (Gemini, Ollama, or sentence-transformers) for richer knowledge graph (KG) enrichment. Retrieval runs in parallel across vector search (ChromaDB), KG entity lookup (LightRAG), and cross-encoder reranking (BAAI/bge-reranker-v2-m3) [https://huggingface.co/BAAI/bge-m3, https://huggingface.co/BAAI/bge-reranker-v2-m3]. Answers are synthesized from KG insights, cited chunks, and media, with auto-generated 4-character citations for precise source grounding and navigation.
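The parallel-retrieval-plus-rerank step can be sketched as follows. This is an illustrative toy, not NexusRAG's actual code: `Chunk`, `hybrid_retrieve`, and the `rerank_fn` callback are hypothetical names, and the real system would call ChromaDB, LightRAG, and the bge cross-encoder instead of the in-memory stand-ins used here.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    chunk_id: str
    text: str
    vector_score: float = 0.0   # similarity from the vector store (ChromaDB in NexusRAG)
    kg_score: float = 0.0       # relevance from KG entity lookup (LightRAG in NexusRAG)
    rerank_score: float = 0.0   # cross-encoder score (bge-reranker-v2-m3 in NexusRAG)

def hybrid_retrieve(query: str, vector_hits: list[Chunk], kg_hits: list[Chunk],
                    rerank_fn, top_k: int = 3) -> list[Chunk]:
    """Merge candidates from both retrieval paths, dedupe by chunk_id,
    then let a cross-encoder produce the final ordering."""
    pool: dict[str, Chunk] = {}
    for hit in vector_hits + kg_hits:
        existing = pool.setdefault(hit.chunk_id, hit)
        # Keep the best score seen from each retrieval path.
        existing.vector_score = max(existing.vector_score, hit.vector_score)
        existing.kg_score = max(existing.kg_score, hit.kg_score)
    for chunk in pool.values():
        chunk.rerank_score = rerank_fn(query, chunk.text)
    return sorted(pool.values(), key=lambda c: c.rerank_score, reverse=True)[:top_k]
```

The key design point this illustrates is that the two retrievers only nominate candidates; the cross-encoder alone decides the final ranking, which is why a chunk found by the KG path can outrank a higher-similarity vector hit.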

Quick Start & Requirements

  • Primary install / run command:
    • Docker: docker compose up -d
    • Local Dev: ./setup.sh followed by ./run_bk.sh and ./run_fe.sh
  • Non-default prerequisites: Python 3.10+, Node.js 18+, Docker 20+. Google AI API key for Gemini, or Ollama for local models.
  • Estimated setup time or resource footprint: Initial Docker build ~5-10 minutes (~2.5GB model download). RAM: Min 4GB, Recommended 8GB+. Disk: Min 5GB, Recommended 10GB+.
  • Links: UI: http://localhost:5174, API Docs: http://localhost:8080/docs.

Highlighted Details

  • Hybrid Retrieval Pipeline: Integrates vector search, KG entity lookup, and cross-encoder reranking for superior accuracy.
  • Visual Document Intelligence: Captions images/tables via LLMs, embedding descriptions for vector search and surfacing them alongside text answers.
  • Advanced Document Parsers: Offers Docling (default, feature-rich, high VRAM) and Marker (lighter, better math/formulas) for robust structural preservation.
  • Multi-Provider LLM: Seamlessly integrates Google Gemini (cloud) and local Ollama models, supporting configurable thinking levels.
  • Agentic Streaming Chat: Real-time SSE chat with visual agent steps, extended reasoning display, and robust function calling.
  • Precise Citation System: Generates 4-character inline citations with page/heading details, enabling direct navigation within the document viewer.
  • Interactive Knowledge Graph: Visualizes extracted entities/relationships using LightRAG, supporting multi-hop queries and interactive exploration.
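A minimal sketch of how a 4-character citation tag might be derived and rendered inline. The hashing scheme and the `citation_id`/`cite` helpers are assumptions for illustration only; the README does not document how NexusRAG actually generates its tags.

```python
import hashlib

def citation_id(doc_name: str, chunk_index: int) -> str:
    """Derive a stable 4-character tag from a chunk's identity.
    Hash-based so the same chunk always yields the same tag."""
    digest = hashlib.sha1(f"{doc_name}:{chunk_index}".encode()).hexdigest()
    return digest[:4]

def cite(sentence: str, doc_name: str, chunk_index: int,
         page: int, heading: str) -> str:
    """Append an inline marker plus the page/heading details a
    document viewer could use for click-to-navigate."""
    tag = citation_id(doc_name, chunk_index)
    return f'{sentence} [{tag}] (p. {page}, "{heading}")'
```

Deriving the tag from content identity (rather than a running counter) keeps citations stable across re-indexing, at the cost of rare 4-hex-character collisions that would need a fallback.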

Maintenance & Community

The README provides no details on notable contributors, sponsorships, or community channels (e.g., Discord/Slack).

Licensing & Compatibility

  • License type: MIT License.
  • Compatibility notes: The permissive MIT license generally allows for commercial use and integration with closed-source projects.

Limitations & Caveats

Evaluation revealed weaknesses in retrieval coverage, with some specific facts missed. Faithfulness was also a concern, as the LLM occasionally added unsupported details. Table parsing performance is model-dependent, and language consistency can be an issue with certain local models. The default Docling parser has a high VRAM requirement (~18-20GB) for its advanced features.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 6
  • Issues (30d): 4
  • Star History: 259 stars in the last 27 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems").

RAG-Anything by HKUDS

2.8%
16k
All-in-one multimodal RAG system
Created 10 months ago
Updated 4 days ago
Starred by Shizhe Diao (author of LMFlow; Research Scientist at NVIDIA), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 2 more.

LightRAG by HKUDS

2.5%
33k
RAG framework for fast, simple retrieval-augmented generation
Created 1 year ago
Updated 1 day ago