LeDat98/NexusRAG: Hybrid RAG system for intelligent document analysis and Q&A
NexusRAG offers a sophisticated hybrid Retrieval-Augmented Generation (RAG) system designed to provide accurate, cited answers from documents. It targets engineers, researchers, and power users seeking advanced document understanding capabilities, integrating vector search, knowledge graphs, and visual intelligence for enhanced Q&A, agentic chat, and precise citations.
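The README does not say how the short citation IDs are derived; purely as an illustration, a content-hash scheme like the one below would produce stable 4-character tags for source grounding (`citation_tag` is a hypothetical helper, not part of NexusRAG's API):

```python
import hashlib

def citation_tag(chunk_text: str, length: int = 4) -> str:
    """Derive a short, stable tag from chunk content (illustrative only).

    Hashing the chunk text means the same chunk always gets the same tag,
    so citations stay valid across re-indexing runs.
    """
    digest = hashlib.sha1(chunk_text.encode("utf-8")).hexdigest()
    return digest[:length].upper()

# Map tags back to their source chunks for answer navigation
chunks = ["First source chunk...", "Second source chunk..."]
tag_index = {citation_tag(c): c for c in chunks}
```

A hash-based tag is one of several plausible designs; a sequential counter per document would work equally well but would not be stable across re-ingestion.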
How It Works
NexusRAG implements a multi-stage pipeline that significantly enhances traditional RAG. It begins with advanced document parsing using Docling (default, preserving structure, math, layout, ~18-20GB VRAM) or Marker (lighter, better math, ~2-4GB VRAM) [https://github.com/docling-project/docling, https://github.com/datalab-to/marker]. These parsers extract images and tables, which are then LLM-captioned and embedded for searchability. The system employs dual embedding models: BAAI/bge-m3 for fast vector search and a separate model (Gemini, Ollama, or sentence-transformers) for richer knowledge graph (KG) enrichment. Retrieval is parallelized across vector search (ChromaDB), KG entity lookup (LightRAG), and precise cross-encoder reranking (BAAI/bge-reranker-v2-m3) [https://huggingface.co/BAAI/bge-m3, https://huggingface.co/BAAI/bge-reranker-v2-m3]. Answers are generated by synthesizing KG insights, cited chunks, and media, featuring auto-generated 4-character citations for precise source grounding and navigation.
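The parallel retrieve-then-rerank pattern described above can be sketched as follows. This is a minimal illustration with stub retrievers standing in for ChromaDB vector search, LightRAG KG lookup, and the bge cross-encoder; none of these function names come from NexusRAG itself:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub retrievers: real ones would query ChromaDB and LightRAG.
def vector_search(query: str) -> list[tuple[str, float]]:
    return [("doc1", 0.9), ("doc2", 0.7)]

def kg_lookup(query: str) -> list[tuple[str, float]]:
    return [("doc2", 0.8), ("doc3", 0.6)]

def rerank(query: str, candidates: list[tuple[str, float]]) -> list[tuple[str, float]]:
    # Cross-encoder stand-in: here it simply sorts by the incoming score.
    return sorted(candidates, key=lambda c: c[1], reverse=True)

def retrieve(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    # Run both retrieval paths concurrently, as the pipeline description suggests.
    with ThreadPoolExecutor() as ex:
        futures = [ex.submit(fn, query) for fn in (vector_search, kg_lookup)]
        results = [f.result() for f in futures]
    # Merge candidates, keeping the best score seen per document id.
    merged: dict[str, float] = {}
    for doc_id, score in (hit for hits in results for hit in hits):
        merged[doc_id] = max(score, merged.get(doc_id, 0.0))
    return rerank(query, list(merged.items()))[:top_k]
```

The merge step deduplicates documents surfaced by both paths before the (comparatively expensive) cross-encoder pass, which is the usual reason to rerank only a merged candidate pool rather than each retriever's output separately.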
Quick Start & Requirements
- Docker: docker compose up -d
- Manual: ./setup.sh, then ./run_bk.sh and ./run_fe.sh
- Frontend: http://localhost:5174; API docs: http://localhost:8080/docs

Highlighted Details
Maintenance & Community
No specific details on notable contributors, sponsorships, or community channels (e.g., Discord/Slack) are provided in the README.
Licensing & Compatibility
Limitations & Caveats
Evaluation revealed weaknesses in retrieval coverage, with some specific facts missed. Faithfulness was also a concern, as the LLM occasionally added unsupported details. Table parsing performance is model-dependent, and language consistency can be an issue with certain local models. The default Docling parser has a high VRAM requirement (~18-20GB) for its advanced features.