tinyRAG  by phbst

DIY RAG framework for learning

Created 1 year ago
251 stars

Top 99.8% on SourcePulse

Project Summary

tinyRAG is a hand-written implementation of Retrieval-Augmented Generation (RAG) built for learning. It targets engineers and researchers who want to understand RAG's core principles, and it offers a transparent alternative to larger libraries such as LangChain or LlamaIndex. Because each stage is implemented directly, users can experiment with indexing, retrieval, and generation from first principles.

How It Works

The system follows a three-step RAG process: Indexing, Retrieval, and Generation. First, documents are split into smaller chunks, and each chunk is converted into a vector embedding by the chosen encoder model. Second, the user query is embedded with the same model, and a similarity search against the vector index returns the most relevant chunks. Finally, the retrieved chunks are passed to a Large Language Model (LLM) as context for the answer, which helps reduce hallucinations and reliance on outdated information. This manual approach favors a clear view of the algorithm over library convenience.
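The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: every function name here is hypothetical, the bag-of-words `embed` is a toy stand-in for a real encoder model, and the generation step stops at building the prompt, since the LLM call depends on whichever provider is configured.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Indexing, part 1: split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an encoder model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing, part 2: store each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Retrieval: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Generation input: put the retrieved chunks in front of the question."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical sample data, used only to exercise the functions above.
chunks = [
    "The Eiffel Tower is located in Paris.",
    "Python is a programming language.",
    "Cosine similarity compares two vectors.",
]
index = build_index(chunks)
query = "Where is the Eiffel Tower?"
prompt = build_prompt(query, retrieve(query, index, k=1))
print(prompt)
```

Swapping `embed` for a call to a real embedding model, and sending `prompt` to an LLM, would turn this into a working pipeline; the index and retrieval logic would stay the same.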

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.10 or higher.
  • Dependencies: Requires selection and setup of embedding models (e.g., Zhipuembedding, OpenAIembedding, HFembedding, Jinaembedding). PDF processing relies on PyPDF2. A Gradio-based web demo is included (webdemo_by_gradio.ipynb).
  • Resources: Local vector database persistence is supported (db directory).
  • Docs: Detailed implementation insights are available via a linked blog post: https://zhuanlan.zhihu.com/p/688842148

Highlighted Details

  • Emphasizes a "hand-crafted" implementation to facilitate learning RAG principles, contrasting with higher-level abstractions in popular frameworks.
  • Offers flexibility in embedding models, supporting Chinese text with ZhipuEmbedding and English text with OpenAI or Hugging Face options.
  • Modular project structure (component directory) separates concerns like data chunking, embedding, databases, and LLM integration.

Maintenance & Community

No specific information regarding maintainers, community channels (e.g., Discord, Slack), sponsorships, or a public roadmap is provided in the README.

Licensing & Compatibility

The README does not specify a license. Without an explicit license, commercial use or integration into closed-source projects may be restricted.

Limitations & Caveats

Retrieval can miss information when key details are split across chunk boundaries, which suggests that the chunk size and overlap settings need tuning. As a learning-focused, hand-written system, it may also lack the robustness, scalability, and feature coverage that production-grade RAG applications get from established libraries.
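One common way to reduce the boundary problem is to let consecutive chunks overlap, so a detail that straddles one boundary still appears whole in a neighboring chunk. The sketch below assumes word-based chunking; the function name and parameters are hypothetical, and the repository's own chunking strategy may differ.

```python
def chunk_with_overlap(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `size`, each sharing `overlap` words
    with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample text: ten placeholder words w0..w9.
sample = " ".join(f"w{i}" for i in range(10))
plain = chunk_with_overlap(sample, size=4, overlap=0)
overlapped = chunk_with_overlap(sample, size=4, overlap=2)
```

With no overlap, the adjacent words `w3 w4` land in different chunks; with an overlap of two words they appear together in one chunk. The cost is a larger index, since overlapping windows produce more chunks for the same text.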

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

5 stars in the last 30 days
