tiny-rag by wdndev

Tiny RAG system for retrieval-augmented LLM

Created 1 year ago

334 stars

Top 82.2% on SourcePulse

Project Summary

This project provides a minimal Retrieval-Augmented Generation (RAG) system for researchers and developers who need a lightweight, modular framework. It integrates external knowledge into LLM prompts to improve response accuracy and relevance, and it supports several document types along with multiple embedding and LLM backends.

How It Works

The system runs in three stages: document parsing and embedding, offline database construction, and online retrieval with re-ranking. Text is embedded with models such as BGE and images with models such as CLIP. Retrieval combines BM25 keyword search with vector similarity search using FAISS. A re-ranking model then reorders the retrieved candidates before they are passed to the LLM as context.
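The retrieval flow described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names, the toy 2-d vectors, and the token-overlap `rerank_fn` are all assumptions made for the example. A real setup would use BGE embeddings, a FAISS index, and a trained re-ranking model in their place.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def top_k(scores, k):
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def hybrid_retrieve(query_tokens, query_vec, docs_tokens, doc_vecs, rerank_fn, k=2):
    """Merge keyword and vector candidates, then reorder them with rerank_fn."""
    keyword_hits = top_k(bm25_scores(query_tokens, docs_tokens), k)
    vector_hits = top_k([cosine(query_vec, v) for v in doc_vecs], k)
    # Deduplicate while keeping first-seen order.
    candidates = list(dict.fromkeys(keyword_hits + vector_hits))
    return sorted(candidates, key=rerank_fn, reverse=True)


# Toy corpus; the 2-d vectors stand in for real embeddings (assumption).
docs = [
    "faiss builds a vector index",
    "bm25 ranks documents by keyword overlap",
    "clip embeds images and text",
]
docs_tokens = [d.split() for d in docs]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query_tokens = ["keyword", "bm25"]
query_vec = [1.0, 0.1]


def rerank_fn(i):
    # Placeholder re-ranker: counts shared tokens between query and document.
    return len(set(query_tokens) & set(docs_tokens[i]))


ranked = hybrid_retrieve(query_tokens, query_vec, docs_tokens, doc_vecs, rerank_fn)
print([docs[i] for i in ranked])
```

Running both retrieval paths before re-ranking lets an exact keyword match and a semantically close passage both reach the final candidate list, even when only one path would have found them.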

Quick Start & Requirements

Install/Run:
Build the database: python script/tiny_rag.py -t build -c config/qwen2_config.json -p data/raw_data/wikipedia-cn-20230720-filtered.json
Search: python script/tiny_rag.py -t search -c config/qwen2_config.json
Prerequisites: Python, FAISS, BGE, CLIP, Qwen2 (or other supported LLMs). Specific models need to be downloaded separately as indicated in the README.
Setup: Requires downloading multiple models. Configuration is managed via JSON files.
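Since configuration lives in JSON files, a small loader that checks for required entries before anything else runs can catch a missing model path early. The sketch below is hypothetical: the key names and paths are placeholders chosen for illustration and are not taken from the project's config/qwen2_config.json.

```python
import json

# Hypothetical key names; the project's real config schema may differ.
REQUIRED_KEYS = ("embedding_model", "reranker_model", "llm_model", "db_dir")


def load_config(text):
    """Parse a JSON config string and fail fast if a required key is absent."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


# Placeholder values for illustration only.
example = """
{
    "embedding_model": "models/embedding",
    "reranker_model": "models/reranker",
    "llm_model": "models/llm",
    "db_dir": "data/db"
}
"""

config = load_config(example)
print(config["db_dir"])
```

For a file on disk, the same function can be fed the result of reading the file, for example `load_config(open(path, encoding="utf-8").read())`.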

Highlighted Details

Supports multiple document parsing formats (txt, markdown, pdf, word, ppt, images).
Implements dual-path retrieval (BM25 and vector similarity) with a re-ranking stage.
Offers modularity for custom embedding and LLM integrations by inheriting base classes.
Includes support for both local LLM inference and API-based models.
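The base-class extension pattern mentioned above might look like the following sketch. The names `BaseEmbedding`, `embed`, and `HashingEmbedding` are hypothetical and are not the project's actual class or method names; the hashing embedder is a toy stand-in for a real model such as BGE.

```python
import math
from abc import ABC, abstractmethod
from typing import List


class BaseEmbedding(ABC):
    """Hypothetical base class: subclasses only need to implement embed()."""

    @abstractmethod
    def embed(self, text: str) -> List[float]:
        """Return a fixed-length vector for the given text."""

    def similarity(self, a: str, b: str) -> float:
        """Cosine similarity between the embeddings of two texts."""
        u, v = self.embed(a), self.embed(b)
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(y * y for y in v))
        return dot / (nu * nv) if nu and nv else 0.0


class HashingEmbedding(BaseEmbedding):
    """Toy backend: buckets tokens by a deterministic byte-sum hash."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            # Sum of UTF-8 bytes is stable across runs, unlike built-in hash().
            vec[sum(token.encode("utf-8")) % self.dim] += 1.0
        return vec


embedder = HashingEmbedding(dim=8)
print(embedder.similarity("tiny rag system", "tiny rag system"))
```

Because shared logic such as `similarity` lives in the base class, swapping in a different backend means writing one method, and the rest of the pipeline can stay unchanged.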

Maintenance & Community

The project appears to be a personal or small-team effort with no explicit mention of maintainers, community channels (like Discord/Slack), or a public roadmap.

Licensing & Compatibility

The README does not state a license. The project appears to be aimed at research and development use; anyone planning commercial use should first confirm the licensing terms with the author.

Limitations & Caveats

The project uses smaller models for demonstration; larger models are recommended for better performance. The README does not cover error handling, scalability, or performance benchmarks. The absence of explicit community support and a stated license may be a concern for production deployments.
