ragflow-upload  by Samge0

CLI tool for RagFlow knowledge base automation

Created 1 year ago
483 stars

Top 63.5% on SourcePulse

Project Summary

This repository provides a Python script to automate the batch uploading and parsing of documents into a RagFlow knowledge base. It addresses the limitations of RagFlow's default interface, which requires manual, sequential uploads and parsing, by offering a streamlined, automated process for handling large volumes of documents. This is particularly beneficial for users needing to ingest extensive datasets, such as personal notes or large collections of research papers, into their LLM-based Q&A systems.

How It Works

The script iterates through a specified directory, uploading each document and starting its parsing individually. It moves on to the next document only after the current one has finished parsing, so a large batch can run without manual intervention. This sequential, automated workflow directly tackles the inefficiency of uploading and parsing documents by hand in RagFlow's interface.
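The upload-then-wait loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the function names `ingest_sequentially` and `list_documents`, and the injected `upload`, `start_parse`, and `is_parsed` callables, are hypothetical stand-ins for whatever calls the script makes to RagFlow.

```python
import time
from pathlib import Path
from typing import Callable, Iterable, List


def list_documents(directory: Path) -> List[Path]:
    """Return the regular files in a directory, in a stable sorted order."""
    return sorted(p for p in directory.iterdir() if p.is_file())


def ingest_sequentially(
    paths: Iterable[Path],
    upload: Callable[[Path], str],        # hypothetical: uploads a file, returns a document id
    start_parse: Callable[[str], None],   # hypothetical: asks the server to parse that document
    is_parsed: Callable[[str], bool],     # hypothetical: True once parsing has finished
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Upload each file, then wait for its parsing to finish before moving on."""
    finished: List[str] = []
    for path in paths:
        doc_id = upload(path)
        start_parse(doc_id)

        # Poll until this document is parsed; only then continue with the next one.
        waited = 0.0
        while not is_parsed(doc_id):
            if waited >= timeout:
                raise TimeoutError(f"parsing did not finish in time for {path}")
            sleep(poll_interval)
            waited += poll_interval

        finished.append(doc_id)
    return finished
```

Passing the three operations in as callables keeps the sketch independent of any particular RagFlow client. In real use they would wrap the actual upload, parse, and status requests, and `list_documents` would supply the paths.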

Quick Start & Requirements

  • Install:
    • Create environment: conda create -n ragflow-upload python=3.10.13 -y
    • Install dependencies: pip install -r requirements.txt
  • Configuration: Copy ragflows/configs.demo.py to ragflows/configs.py and edit it. Refer to issue #2 for configuration details.
  • Run: python ragflows/main.py
  • Prerequisites: Python 3.10.13 and Conda for environment management.

Highlighted Details

  • Automates sequential upload and parsing of documents into RagFlow.
  • Reduces manual intervention for large document sets.
  • Designed to improve efficiency for users with extensive data ingestion needs.

Maintenance & Community

No specific information on contributors, sponsorships, or community channels (like Discord/Slack) is provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not detail error handling, supported document types beyond what RagFlow handles, or performance benchmarks. The missing license may also hinder commercial adoption.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 11 stars in the last 30 days
