FlashRAG by RUC-NLPIR

Python toolkit for efficient RAG research

created 1 year ago
2,575 stars

Top 18.6% on sourcepulse

Project Summary

FlashRAG is a Python toolkit designed for efficient Retrieval-Augmented Generation (RAG) research, enabling users to reproduce and develop RAG systems. It offers a comprehensive framework with 36 pre-processed benchmark datasets and 17 state-of-the-art RAG algorithms, catering to researchers and developers in the RAG domain.

How It Works

FlashRAG provides a modular architecture with components for retrievers, rerankers, generators, and compressors, allowing flexible pipeline assembly. It supports various retrieval methods (dense and sparse) using Faiss and Pyserini/bm25s, and integrates with LLM acceleration tools like vLLM and FastChat. The toolkit simplifies RAG workflow preparation through efficient preprocessing scripts and offers an easy-to-use UI for configuration and experimentation.
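The component layout described above can be illustrated with a small, self-contained sketch. This is not FlashRAG's API: every class and method name here (`KeywordRetriever`, `LengthReranker`, `TruncatingCompressor`, `EchoGenerator`, `ToyPipeline`) is hypothetical, and the toy keyword-overlap scoring stands in for a real Faiss or BM25 retriever. It only shows how interchangeable retriever, reranker, compressor, and generator stages might be assembled into one pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    doc_id: str
    text: str
    score: float = 0.0


def _tokens(text: str) -> set:
    """Lowercased whitespace tokens; a stand-in for a real tokenizer."""
    return set(text.lower().split())


class KeywordRetriever:
    """Hypothetical toy sparse retriever: scores by query-term overlap."""

    def __init__(self, corpus: List[Doc]):
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int = 3) -> List[Doc]:
        q = _tokens(query)
        scored = [
            Doc(d.doc_id, d.text, float(len(q & _tokens(d.text))))
            for d in self.corpus
        ]
        scored.sort(key=lambda d: d.score, reverse=True)
        return scored[:top_k]


class LengthReranker:
    """Hypothetical reranker: keeps score order, prefers shorter passages on ties."""

    def rerank(self, query: str, docs: List[Doc]) -> List[Doc]:
        return sorted(docs, key=lambda d: (-d.score, len(d.text)))


class TruncatingCompressor:
    """Hypothetical compressor: keeps the first `max_words` words of each passage."""

    def __init__(self, max_words: int = 20):
        self.max_words = max_words

    def compress(self, docs: List[Doc]) -> str:
        return " ".join(
            " ".join(d.text.split()[: self.max_words]) for d in docs
        )


class EchoGenerator:
    """Placeholder for an LLM: returns the prompt it would have been given."""

    def generate(self, query: str, context: str) -> str:
        return f"Question: {query}\nContext: {context}"


class ToyPipeline:
    """Runs retrieve -> (rerank) -> (compress) -> generate in sequence."""

    def __init__(self, retriever, generator,
                 reranker: Optional[LengthReranker] = None,
                 compressor: Optional[TruncatingCompressor] = None):
        self.retriever = retriever
        self.generator = generator
        self.reranker = reranker
        self.compressor = compressor

    def run(self, query: str, top_k: int = 2) -> str:
        docs = self.retriever.retrieve(query, top_k=top_k)
        if self.reranker is not None:
            docs = self.reranker.rerank(query, docs)
        if self.compressor is not None:
            context = self.compressor.compress(docs)
        else:
            context = " ".join(d.text for d in docs)
        return self.generator.generate(query, context)


corpus = [
    Doc("d1", "Faiss is a library for vector similarity search"),
    Doc("d2", "BM25 is a sparse ranking function"),
    Doc("d3", "vLLM serves large language models"),
]
pipeline = ToyPipeline(
    KeywordRetriever(corpus),
    EchoGenerator(),
    reranker=LengthReranker(),
    compressor=TruncatingCompressor(max_words=20),
)
answer = pipeline.run("what is faiss")
```

Because the reranker and compressor are optional constructor arguments, a stage can be swapped or dropped without touching the others, which is the kind of flexibility the modular design is meant to provide.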

Quick Start & Requirements

  • Installation: run pip install flashrag-dev --pre, or clone the repository and run pip install -e . from its root.
  • Dependencies: Python 3.10+. Optional: vllm, sentence-transformers, pyserini. Faiss (faiss-cpu or faiss-gpu) must be installed through Conda.
  • Resources: Supports GPU acceleration via vLLM. Index building can be resource-intensive depending on corpus size and retrieval method.
  • Links: Installation, Quick Start, FlashRAG-UI
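Before running experiments, it can help to confirm which of the optional dependencies listed above are present. The following is a minimal sketch using only the Python standard library; the helper name `check_environment` is hypothetical and is not part of FlashRAG, and the import names are assumed to be `faiss`, `vllm`, `sentence_transformers`, and `pyserini`.

```python
import importlib.util
import sys

# Import names (assumed) mapped to the package that provides them.
OPTIONAL_MODULES = {
    "faiss": "faiss-cpu or faiss-gpu (via Conda)",
    "vllm": "vllm",
    "sentence_transformers": "sentence-transformers",
    "pyserini": "pyserini",
}


def check_environment(min_python=(3, 10)):
    """Return a dict of booleans: Python version check plus one entry per module."""
    report = {"python_ok": sys.version_info >= min_python}
    for module_name in OPTIONAL_MODULES:
        # find_spec returns None when a top-level module is not installed.
        report[module_name] = importlib.util.find_spec(module_name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        hint = OPTIONAL_MODULES.get(name, "Python 3.10+")
        print(f"{name:24s} {'found' if ok else 'missing'}  ({hint})")
```

A missing optional module is not necessarily an error; it only means the feature that depends on it (for example, vLLM acceleration) is unavailable.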

Highlighted Details

  • Includes 36 pre-processed RAG benchmark datasets and 17 implemented SOTA RAG algorithms.
  • Supports multimodal RAG with MLLMs like LLaVA and multimodal retrievers.
  • Offers FlashRAG-UI, a visual interface for easy configuration, experimentation, and evaluation.
  • Provides optimized execution with vLLM, FastChat, and Faiss.

Maintenance & Community

The project is under active development, and its roadmap lists plans to add more RAG approaches and evaluation metrics. Contributions are welcome.

Licensing & Compatibility

FlashRAG is licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The toolkit is still under development, and some features are planned for future releases. Although the maintainers aim to reproduce the results of the original methods, all methods are run under uniform settings, so reported numbers may differ from those in the original work. Installing Faiss can be difficult on some systems.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 3
  • Issues (30d): 11
  • Star History: 358 stars in the last 90 days