tinyRAG by phbst

DIY RAG framework for learning

Created 2 years ago

265 stars

Top 96.3% on SourcePulse

Project Summary

This project provides a foundational, hand-crafted implementation of Retrieval-Augmented Generation (RAG) designed for educational purposes. It targets engineers and researchers who want a deep understanding of RAG's core principles, and it offers a transparent alternative to complex libraries like LangChain or LlamaIndex. The primary benefit is demystifying RAG's mechanics: users can experiment with indexing, retrieval, and generation and learn each concept from first principles.

How It Works

The system follows a three-step RAG process: Indexing, Retrieval, and Generation. First, documents are split into smaller chunks, and each chunk is converted into a vector embedding using a chosen encoder model. Second, the user query is embedded with the same model, and a similarity search against the vector index retrieves the most relevant document chunks. Finally, the retrieved chunks are passed to a Large Language Model (LLM) as context so that it can generate a grounded answer, which mitigates issues like hallucinations and outdated information. This manual approach prioritizes algorithmic clarity over library convenience.
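The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not tinyRAG's actual code: the bag-of-words `embed` function is a toy stand-in for a real encoder model, and the names `build_index`, `retrieve`, and `build_prompt` are hypothetical. The final LLM call is left out; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an encoder model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: store each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Retrieval: rank chunks by similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Generation: assemble the prompt that would be sent to an LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )


chunks = [
    "RAG retrieves document chunks before generating an answer.",
    "Gradio builds simple web demos for machine learning models.",
    "Vector embeddings map text to points in a numeric space.",
]
question = "How does RAG generate an answer?"
index = build_index(chunks)
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
```

With a real encoder, `embed` would return a dense float vector and `cosine` would operate on those vectors, but the control flow stays the same.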

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites: Python 3.10 or higher.
Dependencies: An embedding model must be selected and configured (e.g., Zhipuembedding, OpenAIembedding, HFembedding, Jinaembedding). PDF processing relies on PyPDF2. A Gradio-based web demo is included (webdemo_by_gradio.ipynb).
Resources: Local vector database persistence is supported (db directory).
Docs: Detailed implementation insights are available via a linked blog post: https://zhuanlan.zhihu.com/p/688842148
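The local persistence mentioned under Resources can be pictured with a small sketch. The summary does not describe the project's on-disk format, so everything here is an assumption made for illustration: the `save_index` and `load_index` helpers, the `index.json` file name, and the record layout are hypothetical, and a temporary directory stands in for the real db directory.

```python
import json
import tempfile
from pathlib import Path


def save_index(records: list[dict], directory: str) -> Path:
    """Write chunk/vector records to <directory>/index.json (hypothetical layout)."""
    path = Path(directory)
    path.mkdir(parents=True, exist_ok=True)
    target = path / "index.json"
    target.write_text(json.dumps(records, ensure_ascii=False), encoding="utf-8")
    return target


def load_index(directory: str) -> list[dict]:
    """Read the records back from <directory>/index.json."""
    target = Path(directory) / "index.json"
    return json.loads(target.read_text(encoding="utf-8"))


records = [
    {"chunk": "RAG combines retrieval with generation.", "vector": [0.1, 0.2, 0.3]},
    {"chunk": "检索增强生成", "vector": [0.4, 0.5, 0.6]},
]

# A temporary directory stands in for the project's db directory.
with tempfile.TemporaryDirectory() as tmp:
    db_dir = str(Path(tmp) / "db")
    saved_path = save_index(records, db_dir)
    restored = load_index(db_dir)
```

Persisting the vectors avoids re-embedding every document on each run, which matters when the encoder is a paid API.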

Highlighted Details

Emphasizes a "hand-crafted" implementation to facilitate learning RAG principles, contrasting with higher-level abstractions in popular frameworks.
Offers flexibility in embedding models, supporting Chinese text with Zhipuembedding and English text with OpenAI or Hugging Face options.
Modular project structure (component directory) separates concerns like data chunking, embedding, databases, and LLM integration.

Maintenance & Community

No specific information regarding maintainers, community channels (e.g., Discord, Slack), sponsorships, or a public roadmap is provided in the README.

Licensing & Compatibility

The repository's license is not specified in the provided README. Without a stated license, commercial use or integration into closed-source projects may be restricted, so users should confirm the terms with the author before reusing the code.

Limitations & Caveats

The current implementation can fail to retrieve the right information when key details are split across document chunks, which points to the chunk size and overlap strategy as likely areas for improvement. As a learning-focused, manually implemented system, it may also lack the robustness, scalability, and feature coverage that production-grade RAG applications need and that established libraries provide.
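The chunk-boundary problem described above can be illustrated with a sliding-window chunker. This is a hedged sketch rather than the project's own chunking code: the `chunk_text` function and its `chunk_size` and `overlap` parameters are hypothetical, and it splits on characters for simplicity where a real system might split on tokens or sentences.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


text = "abcdefghij"
no_overlap = chunk_text(text, chunk_size=4, overlap=0)    # ['abcd', 'efgh', 'ij']
with_overlap = chunk_text(text, chunk_size=4, overlap=2)  # ['abcd', 'cdef', 'efgh', 'ghij']

# "de" straddles a boundary without overlap but survives intact with overlap.
split_without_overlap = not any("de" in c for c in no_overlap)
kept_with_overlap = any("de" in c for c in with_overlap)
```

Overlap only preserves a detail that is shorter than the overlap itself, and larger overlaps increase the number of chunks to embed and store, so both parameters usually need tuning against the documents at hand.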

Explore Similar Projects

RAG-QA-Generator by wangxb96

RAG-Book by Nipi64310

rag-all-in-one by lehoanglong95

HiRAG by hhy-huang

Awesome-LLM-RAG by jxzhangjhu

MasteringRAG by Steven-Luo

RAG-Interview-Questions-and-Answers-Hub by KalyanKS-NLP

RAG-Survey by hymie122

rag-from-scratch by pguso

WeKnora by Tencent

RAG_Techniques by NirDiamant

ragflow by infiniflow