PIKE-RAG by microsoft

RAG system for domain-specific knowledge extraction

Created 1 year ago
2,368 stars

Top 18.9% on SourcePulse

Project Summary

PIKE-RAG addresses the limitations of standard Retrieval-Augmented Generation (RAG) systems in industrial applications by adding domain-specific knowledge extraction and multi-step reasoning. It targets engineers and researchers who need more accurate, better-grounded RAG solutions for complex tasks.

How It Works

PIKE-RAG employs a modular framework that goes beyond simple embedding-based retrieval. It focuses on context-aware segmentation, automatic term alignment, and multi-granularity knowledge extraction to enhance the understanding and retrieval of domain-specific information. This approach aims to overcome challenges posed by specialized terminology and complex reasoning chains, leading to more reliable RAG performance.
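To make the idea of term alignment concrete, here is a minimal sketch in Python. It is not PIKE-RAG's implementation: the glossary, the `align_terms` and `retrieve` helpers, and the sample chunks are all hypothetical, and simple token overlap stands in for whatever retriever a real pipeline would use. It only illustrates why mapping domain abbreviations to a canonical term before retrieval can change which chunk is returned.

```python
import re

# Hypothetical glossary: domain variants mapped to one canonical term.
GLOSSARY = {
    "api": "active pharmaceutical ingredient",
    "active ingredient": "active pharmaceutical ingredient",
}

# Hypothetical knowledge-base chunks.
CHUNKS = [
    "Keep the active pharmaceutical ingredient below 25 C.",
    "The conveyor belt should be inspected weekly.",
]


def align_terms(text: str, glossary: dict[str, str]) -> str:
    """Replace every glossary variant in text with its canonical term."""
    # Longest variants first so multi-word phrases win over shorter ones.
    variants = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    # A single pass avoids re-replacing text inside a canonical term.
    return pattern.sub(lambda m: glossary[m.group(1).lower()], text)


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], glossary: dict[str, str]) -> str:
    """Return the chunk sharing the most tokens with the aligned query."""
    query_tokens = tokenize(align_terms(query, glossary))
    return max(
        chunks,
        key=lambda c: len(query_tokens & tokenize(align_terms(c, glossary))),
    )


if __name__ == "__main__":
    print(retrieve("How should the API be kept?", CHUNKS, GLOSSARY))
```

In this toy setup, the unaligned query shares more tokens with the conveyor chunk than with the storage chunk, so the alignment step is what lets the abbreviation "API" reach the relevant text.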

Quick Start & Requirements

  • Install: Clone the repository and set up the Python environment.
  • Configuration: Create a .env file for endpoint information and modify YAML configuration files.
  • Resources: Refer to documentation for reproducing experiments on MuSiQue.
  • Links: 🌐Online Demo, 📊Technical Report, Documentation
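As a rough illustration of the configuration step, the sketch below parses `.env`-style text with the Python standard library only. The key names `ENDPOINT_URL` and `MODEL_NAME` and the `parse_env_text` helper are placeholders invented for this example, not names taken from the PIKE-RAG repository; consult the project documentation for the variables it actually expects.

```python
# Hypothetical .env contents; the key names are placeholders, not PIKE-RAG's.
SAMPLE_ENV = """\
# Endpoint information (illustrative values only)
ENDPOINT_URL="https://example.invalid/v1"

MODEL_NAME = demo-model
"""


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    for key, value in parse_env_text(SAMPLE_ENV).items():
        print(f"{key} -> {value}")
```

Keeping endpoint details in a `.env` file, separate from the YAML pipeline settings, means credentials and URLs can stay out of version control while the pipeline configuration remains shareable.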

Highlighted Details

  • Achieved 87.6% accuracy on HotpotQA, 82.0% on 2WikiMultiHopQA, and 59.6% on MuSiQue.
  • Demonstrated significant improvements in industrial manufacturing, mining, and pharmaceuticals.
  • Supports flexible pipeline customization for diverse industrial needs.

Maintenance & Community

  • Contributions are welcome; contributors must agree to a Contributor License Agreement (CLA).
  • Adheres to the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

Because the README does not state a license, commercial use or integration into closed-source projects may be restricted. Confirm the project's licensing before adopting it.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 8 stars in the last 30 days
