raptor by parthsarthi03

Retrieval-augmented language model research paper

Created 1 year ago
1,413 stars

Top 28.8% on SourcePulse

Project Summary

RAPTOR offers a novel retrieval-augmented generation (RAG) approach by building a recursive tree structure from documents, enabling more efficient and context-aware information retrieval. It is designed for researchers and developers working with large text corpora who need to improve the accuracy and relevance of language model responses.

How It Works

RAPTOR builds a hierarchical tree of summaries from input documents. It splits the text into chunks, embeds and clusters them, summarizes each cluster, and then repeats the process on the resulting summaries until it reaches the top of the tree. At query time, retrieval can draw on nodes at different levels of abstraction, so the language model receives both fine-grained passages and higher-level summaries. The framework is extensible: users can plug in custom summarization, question-answering, and embedding models.
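The recursive construction can be illustrated with a minimal, dependency-free sketch. It is not the project's code: `Node`, `build_tree`, `toy_summarize`, and `retrieve` are hypothetical names, grouping adjacent nodes stands in for clustering, truncation stands in for an LLM summarizer, and word overlap stands in for embedding similarity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    text: str
    level: int
    children: List["Node"] = field(default_factory=list)


def chunk_words(text: str, size: int) -> List[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_summarize(texts: List[str]) -> str:
    """Placeholder for an LLM summarizer: keep the first 5 words of each input."""
    return " ".join(" ".join(t.split()[:5]) for t in texts)


def build_tree(
    text: str,
    chunk_size: int = 20,
    group_size: int = 2,
    summarize: Callable[[List[str]], str] = toy_summarize,
) -> List[List[Node]]:
    """Return the tree as a list of layers, leaves first and root last."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each layer shrinks")
    chunks = chunk_words(text, chunk_size)
    if not chunks:
        raise ValueError("text must contain at least one word")
    layers = [[Node(chunk, 0) for chunk in chunks]]
    while len(layers[-1]) > 1:
        previous = layers[-1]
        level = previous[0].level + 1
        parents = []
        # Assumption: adjacent nodes are grouped; a real system would cluster by embedding.
        for i in range(0, len(previous), group_size):
            group = previous[i:i + group_size]
            parents.append(Node(summarize([n.text for n in group]), level, group))
        layers.append(parents)
    return layers


def retrieve(layers: List[List[Node]], query: str, top_k: int = 2) -> List[Node]:
    """Score every node in every layer by word overlap and return the best top_k."""
    query_words = set(query.lower().split())
    nodes = [node for layer in layers for node in layer]
    ranked = sorted(
        nodes,
        key=lambda n: len(query_words & set(n.text.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    demo = "alpha beta gamma delta epsilon zeta eta theta iota kappa"
    tree = build_tree(demo, chunk_size=2, group_size=2)
    print([len(layer) for layer in tree])
    print([node.text for node in retrieve(tree, "iota kappa")])
```

Searching all layers at once keeps the sketch short; walking the tree from the root downward is another option.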

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires Python 3.8+
  • Requires an OpenAI API key set as an environment variable (OPENAI_API_KEY).
  • See demo.ipynb for examples with custom models.
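A small pre-flight check can confirm the two requirements listed above before running the demo. This is only a sketch using the Python standard library; `check_environment` is a hypothetical helper and is not part of the project.

```python
import os
import sys
from typing import List, Mapping, Optional, Tuple


def check_environment(
    env: Optional[Mapping[str, str]] = None,
    version: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []
    if tuple(version) < (3, 8):
        problems.append("Python 3.8 or newer is required.")
    if not env.get("OPENAI_API_KEY"):
        problems.append("The OPENAI_API_KEY environment variable is not set.")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print(issue)
    if not issues:
        print("Environment looks ready.")
```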

Highlighted Details

  • Implements a recursive summarization strategy to build a tree-like document index.
  • Supports integration of custom summarization, QA, and embedding models (e.g., Llama, Mistral, SBERT).
  • Allows saving and loading of the constructed document tree for persistence.
  • Accompanying paper published at ICLR 2024.
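The pluggable-model and persistence points above can be pictured with the following sketch. Every class and function name here is hypothetical and chosen for illustration; the project's real base classes and save format are documented in its repository and demo.ipynb. The toy models stand in for backends such as SBERT or Llama, and `pickle` is assumed only as one simple way to persist an index.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, List


class EmbeddingModel(ABC):
    """Hypothetical interface: turn text into a vector."""

    @abstractmethod
    def embed(self, text: str) -> List[float]:
        ...


class SummarizationModel(ABC):
    """Hypothetical interface: condense text to at most max_words words."""

    @abstractmethod
    def summarize(self, context: str, max_words: int) -> str:
        ...


class VowelCountEmbedding(EmbeddingModel):
    """Toy stand-in for a real embedding model: counts of a, e, i, o, u."""

    def embed(self, text: str) -> List[float]:
        lowered = text.lower()
        return [float(lowered.count(vowel)) for vowel in "aeiou"]


class TruncatingSummarizer(SummarizationModel):
    """Toy stand-in for an LLM summarizer: keep the first max_words words."""

    def summarize(self, context: str, max_words: int) -> str:
        return " ".join(context.split()[:max_words])


def build_index(
    chunks: List[str],
    embedder: EmbeddingModel,
    summarizer: SummarizationModel,
    max_words: int = 8,
) -> Dict:
    """Build a one-level index with whichever models the caller supplies."""
    return {
        "leaves": [{"text": c, "embedding": embedder.embed(c)} for c in chunks],
        "summary": summarizer.summarize(" ".join(chunks), max_words),
    }


def save_index(index: Dict, path: str) -> None:
    with open(path, "wb") as handle:
        pickle.dump(index, handle)


def load_index(path: str) -> Dict:
    # Only unpickle files you created yourself; pickle can execute arbitrary code.
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    index = build_index(
        ["Trees group related text.", "Summaries sit above the leaves."],
        VowelCountEmbedding(),
        TruncatingSummarizer(),
    )
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "index.pkl")
        save_index(index, path)
        print(load_index(path) == index)
```

Because the index builder depends only on the two interfaces, swapping in a different model means writing one small subclass rather than changing the builder.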

Maintenance & Community

The project is the official implementation of the RAPTOR paper, co-authored by Christopher D. Manning. Further examples and configuration guides are planned.

Licensing & Compatibility

Released under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The project is marked as "Work in Progress" (WIP) with forthcoming documentation and advanced features. Initial setup requires an OpenAI API key, and custom model integration details are still being developed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 60 stars in the last 30 days
