RAG research framework for efficient generative pipelines
fastRAG is a research framework for building efficient retrieval-augmented generation (RAG) pipelines, aimed at researchers and developers. It combines state-of-the-art LLMs and information retrieval techniques with optimized components to improve compute efficiency.
How It Works
fastRAG leverages the Haystack and HuggingFace ecosystems, offering full compatibility with Haystack v2+. Its core advantage lies in its optimized components, including efficient bi-encoders, sparse cross-encoders, ColBERT for token-based late interaction, Fusion-in-Decoder (FiD), REPLUG, and the PLAID indexing engine. It also provides backend support for various LLM execution environments, including Intel Gaudi accelerators, ONNX Runtime, OpenVINO, and Llama-CPP.
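To make "token-based late interaction" concrete, here is a minimal sketch of the MaxSim scoring rule commonly associated with ColBERT. It is not fastRAG's API: the function names and the toy 2-dimensional embeddings are assumptions chosen for illustration, and a real pipeline would obtain per-token embeddings from a trained encoder.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: for each query token, take the highest cosine
    similarity against any document token, then sum over query tokens."""
    q = [_normalize(v) for v in query_tokens]
    d = [_normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Hypothetical per-token embeddings (one inner list per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # has an exact match for each query token
doc_b = [[1.0, 1.0]]                          # only a partial match for each query token

scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because query and document tokens are encoded independently and only compared at scoring time, document token embeddings can be precomputed and indexed, which is the property an engine such as PLAID is designed to exploit.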
Quick Start & Requirements
Install: pip install fastrag
Optional extras: fastrag[intel], fastrag[openvino], fastrag[qdrant], fastrag[colbert], fastrag[faiss-cpu], fastrag[faiss-gpu]
Install from a source checkout: pip install .
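The bracketed names above are standard pip extras, and pip accepts several of them in one requirement when separated by commas. The sketch below only builds the requirement string; the helper name is hypothetical, and whether a given combination of fastRAG extras installs cleanly together is an assumption that should be checked against the project's own documentation.

```python
def requirement_with_extras(package, extras=()):
    """Return a pip requirement string such as 'fastrag[colbert,qdrant]'."""
    unique = sorted(set(extras))  # de-duplicate and keep the order stable
    return f"{package}[{','.join(unique)}]" if unique else package


# Example: request the ColBERT and Qdrant extras in a single install.
spec = requirement_with_extras("fastrag", ["qdrant", "colbert"])
command = f'pip install "{spec}"'  # quoted so shells do not expand the brackets
print(command)
```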
Highlighted Details
Maintenance & Community
This is a research framework from Intel Labs. Comments, suggestions, issues, and pull requests are welcome.
Licensing & Compatibility
Licensed under the Apache 2.0 License. This is not an official Intel product.
Limitations & Caveats
The framework is research-oriented, so its components and interfaces may change. Users may run into issues, particularly around the recently added Haystack v2+ compatibility, and are encouraged to report them.