StageRAG (darrencxl0301): Hallucination-resistant RAG framework
Summary
StageRAG is a production-ready framework for building hallucination-resistant Retrieval Augmented Generation (RAG) systems. It offers precise control over the speed-versus-accuracy trade-off, enabling high-factuality applications by managing LLM response uncertainty. The target audience includes engineers and researchers developing robust RAG solutions.
How It Works
The framework employs dual-mode pipelines: a 3-step "Speed" mode using 1B and 3B models for rapid responses (~3-5s), and a 4-step "Precision" mode with a 3B model for deeper analysis (~6-12s). It integrates user knowledge bases via JSONL, automatically building vector indices. A novel multi-component confidence scoring system evaluates retrieval quality, answer structure, relevance, and uncertainty, allowing programmatic handling of low-confidence outputs to mitigate hallucinations. The system is optimized for smaller Llama 3.2 models with 4-bit quantization, requiring minimal GPU memory.
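The confidence gate described above can be illustrated independently of the framework. The sketch below is not StageRAG's actual scoring code: it assumes four component scores in [0, 1] (retrieval quality, answer structure, relevance, and an uncertainty penalty), an illustrative weighted combination, and a hypothetical threshold for routing low-confidence answers to a fallback message.

```python
# Illustrative sketch of multi-component confidence gating. The weights,
# component names, and threshold are assumptions, not StageRAG's implementation.
from dataclasses import dataclass


@dataclass
class ConfidenceComponents:
    retrieval_quality: float  # similarity of retrieved passages to the query (0-1)
    answer_structure: float   # well-formedness of the generated answer (0-1)
    relevance: float          # answer-to-query relevance (0-1)
    uncertainty: float        # hedging / self-contradiction signal (0-1, higher = worse)


def overall_confidence(c: ConfidenceComponents) -> float:
    """Combine components into one score; the uncertainty term subtracts."""
    score = (0.35 * c.retrieval_quality
             + 0.20 * c.answer_structure
             + 0.30 * c.relevance
             - 0.15 * c.uncertainty)
    return max(0.0, min(1.0, score))


def handle_answer(answer: str, c: ConfidenceComponents, threshold: float = 0.6) -> str:
    """Return the answer only when confidence clears the threshold."""
    if overall_confidence(c) < threshold:
        return "I could not find a well-supported answer in the knowledge base."
    return answer
```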
Quick Start & Requirements
- Authenticate with HuggingFace (required for the gated Llama 3.2 models): pip install huggingface-hub, then huggingface-cli login.
- Install: cd StageRAG, pip install -r requirements.txt, pip install -e ., python setup.py.
- Prepare data: python scripts/download_data.py, or provide a custom JSONL file.
- Run the demo: python demo/interactive_demo.py --rag_dataset data/data.jsonl [--use_4bit --device cuda].
- Programmatic use via from stagerag import StageRAGSystem (see the sketch below).
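For programmatic use, only the StageRAGSystem import is confirmed by the quick start above; the constructor arguments and query method in this sketch are assumptions modeled on the demo's CLI flags, so check the repository's demo script for the real signatures.

```python
# Minimal programmatic sketch. Only `from stagerag import StageRAGSystem` is
# confirmed; the constructor arguments and `query` method below are assumptions.
from stagerag import StageRAGSystem

rag = StageRAGSystem(
    rag_dataset="data/data.jsonl",  # same JSONL knowledge base the demo uses
    use_4bit=True,                  # 4-bit quantization to reduce GPU memory
    device="cuda",                  # or "cpu" if no CUDA-capable GPU is available
)

result = rag.query("What does the knowledge base say about X?")  # hypothetical method
print(result)
```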
Maintenance & Community
Contributions are welcomed via standard GitHub pull requests. The primary contact is Darren Chai Xin Lun (@darrencxl0301 on GitHub and HuggingFace). Key dependencies include Llama 3.2 models, FAISS, and Sentence Transformers.
Licensing & Compatibility
This project is released under the MIT License, which is permissive for commercial use and integration into closed-source applications.
Limitations & Caveats
Access to the gated Llama 3.2 models must be requested and approved on HuggingFace before the system can be used. While CPU support exists, a CUDA-capable GPU is recommended for acceptable performance.