Easy-RAG by yuntianhe2014

RAG system for learning, usage, and extension

created 1 year ago
502 stars

Top 62.8% on sourcepulse

Project Summary

Easy-RAG is an open-source Retrieval-Augmented Generation (RAG) system designed for learning, practical use, and extension. It targets users who want to build and customize RAG pipelines, and provides knowledge base management, multi-turn chat, and AI search over the internet.

How It Works

Knowledge bases can be created, updated, and deleted from several data formats (txt, csv, pdf, docx, mp3, mp4, wav, Excel). Content is vectorized and stored in Chroma, FAISS, or Elasticsearch. For chat, the system offers multi-turn conversations with LLMs and RAG-based Q&A with several retrieval methods, including reranking with the BGE-reranker-large model. AI web search is integrated through SearxNG, which lets the LLM query the internet. Audio and video files are converted to text with funasr.
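The retrieve-then-rerank flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Easy-RAG's code: the bag-of-words cosine stands in for a real embedding model and vector store, and `toy_pair_score` is a hypothetical placeholder for a cross-encoder such as BGE-reranker-large. All function names here are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for a real embedding model."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=3):
    """Stage 1: cheap similarity search over all documents (vector store stand-in)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def toy_pair_score(query, doc):
    """Hypothetical reranker score: fraction of query tokens found in the document."""
    q_tokens = set(tokenize(query))
    d_tokens = set(tokenize(doc))
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


def rerank(query, candidates, score_fn=toy_pair_score, top_n=1):
    """Stage 2: rescore only the shortlisted candidates with a stronger scorer."""
    ranked = sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)
    return ranked[:top_n]


docs = [
    "faiss is a vector index library",
    "searxng is a metasearch engine",
    "chroma is a vector database",
]
shortlist = retrieve("vector database", docs, k=2)
best = rerank("vector database", shortlist, top_n=1)
```

The point of the two stages is cost: the first pass scores every document cheaply, and the more expensive scorer only sees the shortlist.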

Quick Start & Requirements

  • Install: pip3 install -r requirements.txt
  • Prerequisites: Ollama (for LLMs and embeddings), SearxNG (for web search), Python 3.8+ (tested with 3.10.9), BGE-reranker-large model download.
  • Setup: Requires Ollama model downloads (ollama run qwen2:7b, ollama run mofanke/acge_text_embedding:latest), SearxNG setup (Docker recommended), and configuration of vector database and reranker paths in Config/config.py.
  • Run: python webui.py
  • Docs: https://github.com/yuntianhe2014/Easy-RAG
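Once SearxNG is running, a quick way to check it from Python is to build a query against its `/search` endpoint with `format=json`. The sketch below is not taken from Easy-RAG; the base URL `http://localhost:8080` and the helper names are assumptions, and the JSON output format has to be enabled in the SearxNG settings before the request will succeed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_searxng_url(base_url, query, language="en"):
    """Build a SearxNG search URL that asks for JSON results."""
    params = {"q": query, "format": "json", "language": language}
    return base_url.rstrip("/") + "/search?" + urlencode(params)


def fetch_results(base_url, query, timeout=10):
    """Call SearxNG and return its list of result entries.

    Requires a reachable SearxNG instance with JSON output enabled.
    """
    url = build_searxng_url(base_url, query)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("results", [])


# Assumed local address; adjust to wherever the SearxNG container listens.
url = build_searxng_url("http://localhost:8080/", "what is RAG")
```

Calling `fetch_results("http://localhost:8080", "what is RAG")` against a live instance should return a list of result entries that can be passed to the LLM as context.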

Highlighted Details

  • Supports multiple vector databases: Chroma, FAISS, Elasticsearch (with plans for Milvus, MongoDB).
  • Integrates AI web search via SearxNG for internet-connected AI queries.
  • Includes speech-to-text for audio/video files using funasr.
  • Implements reranking with BGE-reranker-large to improve the relevance of retrieved results.
  • Offers a separate tool for real-time knowledge graph extraction.
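Because the knowledge base accepts both documents and audio/video, ingestion has to decide per file whether to parse text directly or run speech-to-text first. The following is a hypothetical sketch of that routing step, not the project's implementation: the function name, the returned labels, and the `.xls`/`.xlsx` extensions for Excel are assumptions made for illustration.

```python
from pathlib import Path

# Extensions based on the formats listed in the summary; Excel suffixes are assumed.
DOCUMENT_EXTS = {".txt", ".csv", ".pdf", ".docx", ".xls", ".xlsx"}
AUDIO_VIDEO_EXTS = {".mp3", ".mp4", ".wav"}


def route_file(path):
    """Return which ingestion path a file should take, based on its extension."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_VIDEO_EXTS:
        # Audio/video would be transcribed (for example with funasr) before indexing.
        return "speech_to_text"
    if ext in DOCUMENT_EXTS:
        # Text-like files can be parsed and chunked directly.
        return "document_loader"
    raise ValueError(f"Unsupported file type: {ext or path}")


routes = {name: route_file(name) for name in ["talk.MP4", "notes.pdf", "data.csv"]}
```

Unsupported extensions raise `ValueError` rather than being skipped silently, so a bad upload is reported at ingestion time.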

Maintenance & Community

According to the README, recent additions include AI web search, Elasticsearch support, FAISS support, and reranking. Planned work includes more vector database integrations and voice output. The last commit was 11 months ago.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Only Chroma, FAISS, and Elasticsearch are supported so far; planned integrations such as Milvus and MongoDB are not yet available. The README notes that funasr model downloads may be slow on first startup. The license needs to be clarified before commercial adoption.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 24 stars in the last 90 days
