HiRAG by hhy-huang

Retrieval-Augmented Generation with Hierarchical Knowledge

Created 10 months ago

493 stars

Top 62.8% on SourcePulse

Project Summary

HiRAG improves retrieval-augmented generation (RAG) by incorporating hierarchical knowledge into the retrieval step. It targets researchers and developers working with large language models who want more accurate and comprehensive generated text by supplying more structured, relevant context at retrieval time. The reported benefit is higher response quality than existing RAG methods on the Comprehensiveness, Empowerment, and Diversity metrics.

How It Works

HiRAG implements a hierarchical retrieval mechanism that organizes knowledge into a tree-like structure. Retrieval starts with broader, high-level information and then progressively drills down into more specific details. The stated rationale is that this coarse-to-fine order resembles how people look up information, which should yield more contextually relevant and accurate results. The hierarchy also helps disambiguate information and focus answers, especially for complex queries.
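The coarse-to-fine idea can be illustrated with a small toy sketch. This is not HiRAG's implementation: the names Node, overlap, and drill_down are hypothetical, the tree is hand-built, and plain word overlap stands in for whatever similarity scoring a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a toy knowledge tree: a text summary plus child nodes."""
    text: str
    children: list["Node"] = field(default_factory=list)


def overlap(query: str, text: str) -> int:
    """Toy relevance score: count of distinct lowercase words shared by both strings."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def drill_down(root: Node, query: str) -> list[str]:
    """Walk from the root to a leaf, following the best-matching child at each level."""
    path = [root.text]
    node = root
    while node.children:
        # Ties are resolved in favor of the earliest child (behavior of max()).
        node = max(node.children, key=lambda child: overlap(query, child.text))
        path.append(node.text)
    return path


# Hand-built example hierarchy (an assumption for illustration only).
tree = Node("science", [
    Node("physics of motion and energy", [
        Node("newton laws describe motion of bodies"),
        Node("thermodynamics describes heat and energy"),
    ]),
    Node("biology of cells and organisms", [
        Node("mitochondria produce energy inside cells"),
        Node("dna stores genetic information"),
    ]),
])

path = drill_down(tree, "how do cells store genetic information")
print(" -> ".join(path))
```

The returned path lists the broad topic first and the most specific match last, so a caller could pass both the high-level summary and the detailed leaf to the language model as context.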

Quick Start & Requirements

Install: pip install -e . (after cloning the repository)
Prerequisites: Python and, depending on the chosen backend, LLM settings (e.g., DeepSeek, ChatGLM, OpenAI) and API keys as detailed in ./config.yaml.
Usage: The README provides Python code snippets for initializing HiRAG, inserting context, and performing queries with hierarchical retrieval. Examples for integrating with third-party retrieval APIs are also available in the ./ directory.

Highlighted Details

Achieves significantly higher scores across Comprehensiveness, Empowerment, and Diversity metrics compared to Naive RAG, GraphRAG, LightRAG, FastGraphRAG, and KAG.
Reports near-perfect scores in the head-to-head comparison with FastGraphRAG (e.g., 99.2% on Comprehensiveness for the Mix dataset).
Supports various retrieval modes, including hierarchical, naive, and combinations of local/global/bridge knowledge.
The evaluation framework allows for testing with different datasets from Hugging Face and various LLM backends.

Maintenance & Community

The project is associated with the paper "Retrieval-Augmented Generation with Hierarchical Knowledge" accepted to EMNLP 2025 Findings.
Acknowledgements mention the use of open-source projects like nano-graphrag and RAPTOR.
Citation details are provided for the associated paper.

Licensing & Compatibility

The README does not explicitly state a license. Further clarification on licensing and compatibility for commercial or closed-source use would be necessary.

Limitations & Caveats

The README does not specify any explicit limitations or known issues. However, the absence of a stated license could be a significant adoption blocker for commercial applications. The setup might also require careful configuration of API keys and LLM parameters.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1

Star History

15 stars in the last 30 days

