PIKE-RAG by microsoft

RAG system for domain-specific knowledge extraction

Created 11 months ago
2,038 stars

Top 21.9% on SourcePulse

Project Summary

PIKE-RAG addresses the limitations of standard Retrieval-Augmented Generation (RAG) systems in industrial applications by adding domain-specific knowledge extraction and multi-step reasoning. It is aimed at engineers and researchers who need more accurate, better-grounded RAG solutions for complex tasks.

How It Works

PIKE-RAG employs a modular framework that goes beyond simple embedding-based retrieval. It focuses on context-aware segmentation, automatic term alignment, and multi-granularity knowledge extraction to enhance the understanding and retrieval of domain-specific information. This approach aims to overcome challenges posed by specialized terminology and complex reasoning chains, leading to more reliable RAG performance.
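To make the idea of term alignment before retrieval concrete, here is a minimal sketch. It is not PIKE-RAG's API or implementation: the glossary, the function names (`align_terms`, `retrieve`), and the token-overlap scoring are all assumptions chosen for illustration. It only shows why mapping domain variants to a canonical term can help a query match a chunk that uses different wording.

```python
import re
from collections import Counter

# Hypothetical glossary: maps domain variants to one canonical term.
GLOSSARY = {
    "api": "active pharmaceutical ingredient",
    "drug substance": "active pharmaceutical ingredient",
}

def align_terms(text: str) -> str:
    """Lowercase the text and replace known variants with canonical terms."""
    aligned = text.lower()
    # Replace longer variants first so multi-word entries take precedence.
    for variant in sorted(GLOSSARY, key=len, reverse=True):
        aligned = re.sub(rf"\b{re.escape(variant)}\b", GLOSSARY[variant], aligned)
    return aligned

def tokens(text: str) -> Counter:
    """Bag of alphanumeric tokens, computed after term alignment."""
    return Counter(re.findall(r"[a-z0-9]+", align_terms(text)))

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Rank chunks by token overlap with the query (a toy scoring rule)."""
    q = tokens(query)
    ranked = sorted(chunks, key=lambda c: sum((q & tokens(c)).values()), reverse=True)
    return ranked[:top_k]

chunks = [
    "The drug substance must be stored below 25 degrees.",
    "Mining equipment requires weekly inspection.",
]
best = retrieve("storage of the API", chunks)
```

Without alignment, the query and the first chunk share only the word "the"; after alignment both contain the canonical term, so the first chunk ranks highest. A real system would use embeddings and a learned or curated terminology store rather than this hand-written dictionary.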

Quick Start & Requirements

  • Install: Clone the repository and set up the Python environment.
  • Configuration: Create a .env file for endpoint information and modify YAML configuration files.
  • Resources: Refer to documentation for reproducing experiments on MuSiQue.
  • Links: 🌐Online Demo, 📊Technical Report, Documentation
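As a rough illustration of the configuration step, the sketch below parses KEY=VALUE lines of the kind a .env file holds, using only the standard library. The variable names and URL are hypothetical placeholders, not the names PIKE-RAG expects; the project's documentation defines the real ones.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

# Hypothetical contents: these names are placeholders, not PIKE-RAG's.
sample = """
# endpoint settings
EXAMPLE_ENDPOINT_URL="https://example.invalid/llm"
EXAMPLE_API_KEY=changeme
"""
settings = parse_env(sample)
```

Keeping endpoint details in a .env file keeps credentials out of the YAML configuration files, which can then be shared or versioned safely.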

Highlighted Details

  • Achieved 87.6% accuracy on HotpotQA, 82.0% on 2WikiMultiHopQA, and 59.6% on MuSiQue.
  • Demonstrated significant improvements in industrial manufacturing, mining, and pharmaceuticals.
  • Supports flexible pipeline customization for diverse industrial needs.

Maintenance & Community

  • Contributions are welcome; contributors must agree to a Contributor License Agreement (CLA).
  • Adheres to the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

Because the README states no license, commercial use or integration into closed-source projects carries legal uncertainty. Confirm the project's licensing before adopting it.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 94 stars in the last 30 days
