KG-RAG empowers LLMs using knowledge graphs for knowledge-intensive tasks
This framework empowers Large Language Models (LLMs) for knowledge-intensive tasks by integrating explicit knowledge from a biomedical Knowledge Graph (KG) with the implicit knowledge of LLMs, offering "prompt-aware context" for improved accuracy. It is designed for researchers and developers working with biomedical data and LLMs.
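As a sketch of the idea, explicit KG facts can be serialized into the prompt alongside the user's question, so the LLM answers from stated context rather than only its implicit knowledge. All names below are hypothetical; KG-RAG's actual prompt templates may differ.

```python
# Sketch of "prompt-aware context" injection: explicit KG triples are
# serialized and prepended to the question before it reaches the LLM.
# Function names and the template are illustrative, not KG-RAG's API.

def serialize_triples(triples):
    """Turn (subject, predicate, object) triples into plain sentences."""
    return "\n".join(f"{s} {p} {o}." for s, p, o in triples)

def build_prompt(question, triples):
    """Combine explicit KG context with the user's question."""
    context = serialize_triples(triples)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

triples = [
    ("Metformin", "treats", "type 2 diabetes"),
    ("Metformin", "is a", "biguanide"),
]
prompt = build_prompt("What class of drug treats type 2 diabetes?", triples)
```

The resulting prompt string would then be sent to GPT or Llama as-is; only the context assembly is shown here.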
How It Works
KG-RAG combines a massive biomedical KG (SPOKE, with 27M nodes and 53M edges) with LLMs like GPT and Llama. It extracts "prompt-aware context"—the minimal, relevant information from the KG needed to answer a user's query. This approach optimizes domain-specific context for general-purpose LLMs, enhancing their performance on knowledge-intensive tasks.
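The retrieval step, matching the query against embedded disease names and keeping only that disease's KG neighborhood, might look like the following minimal sketch. The embedding (character bigrams) and the tiny in-memory "KG" are toy stand-ins; KG-RAG builds a real disease vector database and queries SPOKE.

```python
from collections import Counter
from math import sqrt

# Toy stand-in for a disease vector database: each disease name is
# embedded (here: character-bigram counts) and the query is matched by
# cosine similarity. Data and names are illustrative, not SPOKE's contents.

def embed(text):
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def prompt_aware_context(question, kg):
    """Return only the KG edges touching the disease best matching the question."""
    q = embed(question)
    best = max(kg, key=lambda disease: cosine(q, embed(disease)))
    return best, kg[best]

kg = {
    "cystic fibrosis": [("cystic fibrosis", "associated with", "CFTR")],
    "sickle cell anemia": [("sickle cell anemia", "associated with", "HBB")],
}
disease, edges = prompt_aware_context("Which gene is linked to cystic fibrosis?", kg)
```

Returning only the matched disease's edges is what keeps the context "minimal and relevant" rather than dumping the whole graph into the prompt.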
Quick Start & Requirements
pip install -r requirements.txt
installs the dependencies; config.yaml must then be updated. Running
python -m kg_rag.run_setup
creates a disease vector database and optionally downloads the Llama model. To generate answers, run
GPT_API_TYPE='openai' python -m kg_rag.rag_based_generation.GPT.text_generation -g
for GPT, or
python -m kg_rag.rag_based_generation.Llama.text_generation -m <method>
for Llama. Interactive modes are available (-i True).
Highlighted Details
The framework pairs the 27M-node, 53M-edge SPOKE KG with both proprietary (GPT) and open (Llama) models; an additional command-line flag (-e) is available in the generation scripts.
Maintenance & Community
The project is associated with BaranziniLab. The primary citation is Soman et al., 2023.
Licensing & Compatibility
The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Currently designed primarily for disease-related prompts, with ongoing work to improve versatility. The setup script may download large models if not already present.