UltraRAG by OpenBMB

RAG framework for domain adaptation, streamlining data construction to model fine-tuning

Created 9 months ago
1,782 stars

Top 24.0% on SourcePulse

Project Summary

UltraRAG is a comprehensive framework for building and optimizing Retrieval-Augmented Generation (RAG) systems, targeting researchers and developers. It offers a one-stop solution for automated knowledge adaptation, simplifying data construction, model fine-tuning, and inference evaluation, with a particular focus on domain-specific RAG applications.

How It Works

UltraRAG uses a modular architecture with three layers: Backend (components such as the knowledge base, retrieval models, and generation models), Workflow (standard RAG patterns plus proprietary methods such as Adaptive-Note and VisRAG), and Function (data synthesis, evaluation, and fine-tuning). Key services can be deployed as microservices, and a frontend provides resource management and access to the functions. The layering lets users customize individual components and plug in newer RAG techniques.
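To make the three-layer split concrete, here is a minimal sketch of how Backend components, a Workflow, and a Function-layer evaluation might relate to one another. Every class and function name below (KnowledgeBase, KeywordRetriever, TemplateGenerator, VanillaRAGWorkflow, retrieval_hit_rate) is hypothetical and chosen for illustration; none of it is UltraRAG's actual API, and the keyword-overlap retriever is a toy stand-in for a real retrieval model.

```python
from dataclasses import dataclass
from typing import List, Tuple


# --- Backend layer: interchangeable components (all names hypothetical) ---
@dataclass
class KnowledgeBase:
    docs: List[str]


class KeywordRetriever:
    """Toy retriever: ranks documents by word overlap with the query."""

    def __init__(self, kb: KnowledgeBase) -> None:
        self.kb = kb

    def retrieve(self, query: str, top_k: int = 2) -> List[str]:
        query_words = set(query.lower().split())
        ranked = sorted(
            self.kb.docs,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


class TemplateGenerator:
    """Stand-in for a generation model: it only formats query and context."""

    def generate(self, query: str, context: List[str]) -> str:
        return f"Question: {query}\nContext: {' | '.join(context)}"


# --- Workflow layer: wires Backend components into a RAG pattern ---
class VanillaRAGWorkflow:
    def __init__(self, retriever: KeywordRetriever, generator: TemplateGenerator) -> None:
        self.retriever = retriever
        self.generator = generator

    def run(self, query: str, top_k: int = 2) -> str:
        context = self.retriever.retrieve(query, top_k)
        return self.generator.generate(query, context)


# --- Function layer: tasks built on top of the other layers, e.g. evaluation ---
def retrieval_hit_rate(
    retriever: KeywordRetriever, cases: List[Tuple[str, str]], top_k: int = 1
) -> float:
    """Fraction of (query, expected_doc) cases where expected_doc is retrieved."""
    if not cases:
        return 0.0
    hits = sum(expected in retriever.retrieve(query, top_k) for query, expected in cases)
    return hits / len(cases)


kb = KnowledgeBase(
    docs=[
        "contract law governs termination clauses",
        "neural networks learn representations",
        "retrieval augments generation with documents",
    ]
)
retriever = KeywordRetriever(kb)
workflow = VanillaRAGWorkflow(retriever, TemplateGenerator())

print(workflow.run("termination of a contract", top_k=1))
print(retrieval_hit_rate(retriever, [("termination of a contract", kb.docs[0])]))
```

The point of the sketch is the separation of concerns: swapping KeywordRetriever for another retriever with the same retrieve method would leave the workflow and the evaluation function unchanged.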

Quick Start & Requirements

  • Deployment: either Docker (docker-compose up --build -d) or Conda, running these commands in order: conda create -n ultrarag python=3.10; conda activate ultrarag; pip install -r requirements.txt.
  • Prerequisites: CUDA 12.2+, Python 3.10+.
  • Model Download: python scripts/download_model.py.
  • WebUI: Access at http://localhost:8843.
  • Demo: streamlit run ultrarag/webui/webui.py --server.fileWatcherType none.
  • Documentation: User Guide, UltraRAG Series.
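
As a small aid for the steps above, the following sketch checks two of the listed requirements from Python: that the interpreter is 3.10 or newer, and that something is listening on the WebUI port (8843, per the list above). The helper names python_ok and webui_reachable are hypothetical and not part of UltraRAG; the CUDA 12.2+ requirement is not checked here.

```python
import socket
import sys
from typing import Sequence, Tuple


def python_ok(
    version: Sequence[int] = sys.version_info, minimum: Tuple[int, int] = (3, 10)
) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= tuple(minimum)


def webui_reachable(host: str = "localhost", port: int = 8843, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print("Python 3.10+:", python_ok())
print("WebUI port open:", webui_reachable())
```

A successful TCP connection only shows that some process is listening on the port; it does not confirm that the process is the UltraRAG WebUI.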

Highlighted Details

  • No-code WebUI for end-to-end RAG setup and optimization, including multimodal RAG (VisRAG).
  • One-click data construction and retrieval with proprietary methods like KBAlign and RAG-DDR.
  • Multidimensional, multi-stage evaluation using the RAGEval method for robust assessment.
  • Research-friendly integration of proprietary methods and support for module-level exploration.

Maintenance & Community

Developed by a collaboration including THUNLP (Tsinghua University) and NEUIR (Northeastern University). New contributors are welcome.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: Permissive license suitable for commercial use and closed-source integration.

Limitations & Caveats

The framework is research-oriented and integrates several proprietary methods, so using it fully may require understanding how those methods are implemented. Reported performance comes from evaluations in a single domain (the legal field) and may not carry over to other domains.

Health Check

  • Last Commit: 14 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 15
  • Issues (30d): 4

Star History

111 stars in the last 30 days
