UltraRAG by OpenBMB

RAG framework for domain adaptation that streamlines the pipeline from data construction to model fine-tuning

created 6 months ago
728 stars

Top 48.5% on sourcepulse

Project Summary

UltraRAG is a framework for building and optimizing Retrieval-Augmented Generation (RAG) systems, aimed at researchers and developers. It automates knowledge adaptation end to end, covering data construction, model fine-tuning, and inference evaluation, with a particular focus on domain-specific RAG applications.

How It Works

UltraRAG uses a modular architecture with three layers: Backend (components such as the knowledge base, retrieval models, and generation models), Workflow (standard RAG patterns plus in-house methods such as Adaptive-Note and VisRAG), and Function (data synthesis, evaluation, and fine-tuning). Key services can be deployed as microservices, and a frontend provides resource management and access to each function. Because the layers are separated, individual components can be customized or replaced, and new RAG techniques can be integrated without changing the rest of the stack.
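
The layering described above can be illustrated with a small sketch. This is not UltraRAG's actual API: every class and function name below (KeywordRetriever, EchoGenerator, VanillaRAGWorkflow, evaluate) is hypothetical, and the retriever and generator are toy stand-ins chosen only to show how a workflow composes backend components and how a function-level task sits on top of a workflow.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Backend layer: interchangeable components (all names are hypothetical).
@dataclass
class KeywordRetriever:
    documents: List[str]

    def retrieve(self, query: str, top_k: int = 2) -> List[str]:
        # Rank documents by the number of words shared with the query;
        # a toy stand-in for a real retrieval model.
        words = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


class EchoGenerator:
    def generate(self, query: str, context: List[str]) -> str:
        # Placeholder for a generation model: returns the top passage verbatim.
        return context[0] if context else ""


# Workflow layer: wires backend components into a retrieve-then-generate pattern.
@dataclass
class VanillaRAGWorkflow:
    retriever: KeywordRetriever
    generator: EchoGenerator

    def run(self, query: str) -> str:
        passages = self.retriever.retrieve(query)
        return self.generator.generate(query, passages)


# Function layer: a task built on a workflow, here a toy exact-match evaluation.
def evaluate(workflow: VanillaRAGWorkflow, dataset: List[Tuple[str, str]]) -> float:
    hits = sum(1 for question, expected in dataset if workflow.run(question) == expected)
    return hits / len(dataset) if dataset else 0.0


docs = [
    "A contract requires offer and acceptance.",
    "Retrieval finds relevant passages.",
]
workflow = VanillaRAGWorkflow(KeywordRetriever(docs), EchoGenerator())
answer = workflow.run("what does a contract require")
score = evaluate(workflow, [("what does a contract require", docs[0])])
```

In this arrangement, swapping the retriever or generator touches only the backend layer, while the workflow and the evaluation function stay unchanged.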

Quick Start & Requirements

  • Deployment (Docker): docker-compose up --build -d
  • Deployment (Conda): conda create -n ultrarag python=3.10, then conda activate ultrarag, then pip install -r requirements.txt
  • Prerequisites: CUDA 12.2+, Python 3.10+.
  • Model Download: python scripts/download_model.py.
  • WebUI: Access at http://localhost:8843.
  • Demo: streamlit run ultrarag/webui/webui.py --server.fileWatcherType none.
  • Documentation: User Guide, UltraRAG Series.
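
Before launching, it can help to confirm two of the requirements listed above. The following is a minimal sketch using only the Python standard library, not a script shipped with UltraRAG: the helper names python_ok and webui_reachable are hypothetical, the port is taken from the WebUI URL above, and the CUDA requirement is not checked here.

```python
import socket
import sys

MIN_PYTHON = (3, 10)  # from the prerequisites listed above
WEBUI_PORT = 8843     # from the WebUI URL listed above


def python_ok(version=None) -> bool:
    """Return True if the given (or running) interpreter is at least Python 3.10."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def webui_reachable(host: str = "localhost", port: int = WEBUI_PORT, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on the WebUI port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("WebUI port open:", webui_reachable())
```

An open port only shows that some process is listening on 8843; it does not confirm that the process is the UltraRAG WebUI.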

Highlighted Details

  • No-code WebUI for end-to-end RAG setup and optimization, including multimodal RAG (VisRAG).
  • One-click data construction and retrieval using in-house methods such as KBAlign and RAG-DDR.
  • Multidimensional, multi-stage evaluation using the RAGEval method.
  • Research-friendly: in-house methods are integrated directly, and individual modules can be explored in isolation.

Maintenance & Community

Developed by a collaboration including THUNLP (Tsinghua University) and NEUIR (Northeastern University). New contributors are welcome.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: Permissive license suitable for commercial use and closed-source integration.

Limitations & Caveats

The framework is research-oriented and integrates several in-house methods, so getting full value from it may require understanding how those methods are implemented. Reported performance comes from evaluations in a single domain (the legal field), so results may not carry over to other domains.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 96 stars in the last 90 days
