llm-rag  by labdmitriy

RAG implementation from scratch, based on Lance Martin's series

created 4 months ago
302 stars

Top 89.3% on sourcepulse

Project Summary

This repository provides a comprehensive environment for experimenting with and implementing advanced Retrieval-Augmented Generation (RAG) techniques. It is designed for researchers and developers looking to explore various query translation, indexing, retrieval, and generation strategies to improve LLM responses with external knowledge.

How It Works

The project takes a modular approach to RAG, breaking complex workflows into distinct components. It uses LangGraph to orchestrate agentic RAG pipelines and provides implementations of query translation techniques (Multi-Query, RAG-Fusion, Decomposition, Step-Back Prompting, HyDE) and indexing methods (Multi-Representation, RAPTOR, ColBERT). These components can be swapped and combined to build custom RAG systems.
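As an illustration of one technique named above: RAG-Fusion is commonly described as running several rewordings of a question through the retriever and merging the resulting rankings with reciprocal rank fusion (RRF). The following is a minimal sketch of that merging step only. The function name, the document IDs, and the choice of k=60 (a commonly used constant) are assumptions for illustration; the repository's own implementation may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k is a smoothing constant (assumed default: 60).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical retrieval results for three rewordings of one user question.
results_per_query = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]

fused = reciprocal_rank_fusion(results_per_query)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that rank well across several query variants rise to the top, which is why the fused list can be more robust than the result of any single query.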

Quick Start & Requirements

  • Installation: run pip install -r requirements.txt, then pip install -e . (or pip install -e .[ragatouille] for ColBERT support). Alternatively, with uv, run uv sync --group dev and then uv pip install -e . (the trailing dot is part of the command).
  • Prerequisites: Python 3.x. Specific models or advanced features might require additional dependencies as detailed in the notebooks.
  • Resources: Setup involves installing Python packages. Running specific RAG pipelines may require significant computational resources depending on the models and data used.
  • Documentation: The project is documented mainly through a series of Jupyter Notebooks, with corresponding Python scripts and video explanations available in a YouTube playlist.

Highlighted Details

  • Extensive coverage of query translation techniques for improved retrieval accuracy.
  • Implementations for advanced indexing strategies like Multi-Representation, RAPTOR, and ColBERT.
  • Integration with LangGraph for building complex, agentic RAG workflows.
  • Includes implementations for Corrective RAG (CRAG) and Self-RAG.
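The control flow behind Corrective RAG can be sketched in plain Python: retrieve, grade each document for relevance, and fall back to another source when nothing passes. This is a simplified variant written for illustration, not the repository's LangGraph implementation. Every name here (corrective_rag, the toy_* helpers, the two-document corpus) is hypothetical, and the keyword grader stands in for what would normally be an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    text: str


def corrective_rag(
    question: str,
    retrieve: Callable[[str], List[Document]],
    is_relevant: Callable[[str, Document], bool],
    web_search: Callable[[str], List[Document]],
    generate: Callable[[str, List[Document]], str],
) -> str:
    """Retrieve, keep only documents graded relevant, fall back if none remain."""
    candidates = retrieve(question)
    relevant = [doc for doc in candidates if is_relevant(question, doc)]
    if not relevant:
        # No retrieved document passed grading: use the fallback source instead.
        relevant = web_search(question)
    return generate(question, relevant)


# Toy stand-ins so the sketch runs without an LLM or a vector store.
CORPUS = [
    Document("RAPTOR builds a tree of summaries."),
    Document("Bananas are yellow."),
]


def toy_retrieve(question: str) -> List[Document]:
    return CORPUS


def toy_grader(question: str, doc: Document) -> bool:
    # Crude keyword overlap on words longer than three characters.
    words = {w.strip("?.,").lower() for w in question.split()}
    return any(w in doc.text.lower() for w in words if len(w) > 3)


def toy_web_search(question: str) -> List[Document]:
    return [Document(f"web result for: {question}")]


def toy_generate(question: str, docs: List[Document]) -> str:
    return f"{len(docs)} source(s); first: {docs[0].text}"


answer_local = corrective_rag(
    "What does RAPTOR build?", toy_retrieve, toy_grader, toy_web_search, toy_generate
)
answer_fallback = corrective_rag(
    "Explain quantum tunnelling", toy_retrieve, toy_grader, toy_web_search, toy_generate
)
print(answer_local)
print(answer_fallback)
```

Passing the retriever, grader, fallback, and generator as plain callables mirrors the swappable-component design described above: each one can be replaced independently, and in a graph framework each would typically become its own node.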

Maintenance & Community

The project appears to be a personal or educational initiative by labdmitriy, based on Lance Martin's original work. There are no explicit mentions of community channels or active development beyond the provided content.

Licensing & Compatibility

The repository does not explicitly state a license. It builds on various open-source libraries, but that does not establish terms for this code itself, so commercial use or closed-source linking would require explicit license verification.

Limitations & Caveats

The project is presented as an educational environment and may not be production-ready without further refinement. Some advanced features, like ColBERT, might have specific hardware or dependency requirements not fully detailed. The lack of explicit licensing is a significant caveat for adoption.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 20 stars in the last 90 days
