RAG-Driven-Generative-AI  by Denis2054

Code repo for a RAG-driven GenAI book

created 1 year ago
474 stars

Top 65.3% on sourcepulse

Project Summary

This repository provides code examples and explanations for building Retrieval Augmented Generation (RAG) pipelines, targeting developers and researchers interested in advanced AI applications. It offers practical guidance on integrating various frameworks and models to create more accurate, traceable, and multimodal generative AI systems.

How It Works

The project demonstrates RAG by connecting large language models (LLMs) with external data sources through vector stores. It emphasizes techniques like adaptive RAG, human feedback integration, and knowledge graph construction to improve retrieval accuracy, reduce hallucinations, and enable multimodal data processing (text and images). The approach balances performance and cost by leveraging tools like LlamaIndex, Deep Lake, Pinecone, and models from OpenAI and Hugging Face.
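The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not the repository's code: `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and the in-memory list stands in for a vector store such as Deep Lake or Pinecone. The final LLM call is omitted, so the sketch stops at the augmented prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; a real pipeline would use a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored passages by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Augment the question with retrieved, numbered context for traceability."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the numbered context.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


store = ToyVectorStore()
store.add("Drones use lidar sensors to avoid obstacles.")
store.add("Pinecone is a managed vector database.")
store.add("Sourdough bread needs a long fermentation.")

question = "How do drones avoid obstacles?"
passages = store.search(question, k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this string would be sent to the LLM
```

Numbering the retrieved passages in the prompt is one simple way to keep answers traceable to their sources.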

Quick Start & Requirements

  • Notebooks can be run directly via Google Colaboratory or Kaggle.
  • Requires API tokens for OpenAI, Activeloop, and Pinecone.
  • Key dependencies include deeplake (3.9.18), openai (1.40.3), transformers (4.41.2), numpy (>=1.24.1), and deepspeed (0.10.1).
  • A requirements_01.txt file is available for environment setup.
  • See Changelog for updates and bonus notebooks.
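Before running a notebook locally, it may help to confirm that the tokens and pinned packages listed above are in place. The sketch below is an assumption-laden helper, not part of the repository: the environment-variable names (`OPENAI_API_KEY`, `ACTIVELOOP_TOKEN`, `PINECONE_API_KEY`) are conventional guesses and the notebooks may read their keys differently, and `missing_keys` and `version_mismatches` are hypothetical function names. The pinned versions are the ones quoted in the list above.

```python
import os
from importlib import metadata

# Assumed variable names; adjust to however the notebooks actually load their keys.
REQUIRED_KEYS = ["OPENAI_API_KEY", "ACTIVELOOP_TOKEN", "PINECONE_API_KEY"]

# Exact pins quoted in the summary. numpy is listed as a range (>=1.24.1),
# so it is left out of this exact-match check.
PINNED = {
    "deeplake": "3.9.18",
    "openai": "1.40.3",
    "transformers": "4.41.2",
    "deepspeed": "0.10.1",
}


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs from the pin to what was found.

    A value of None means the package is not installed.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            report[name] = found
    return report


print("Missing API keys:", missing_keys())
print("Version mismatches:", version_mismatches())
```

On Colab or Kaggle the keys are often supplied through the platform's secrets mechanism instead, in which case only the version check applies.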

Highlighted Details

  • Covers RAG implementation with several models, including xAI's Grok-beta and OpenAI's o1-preview, o3, and GPT-4.5-preview.
  • Demonstrates multimodal RAG for drone technology and video stock production.
  • Explores advanced RAG techniques like adaptive RAG, knowledge graph integration, and fine-tuning.
  • Includes pipelines for scaling RAG with large datasets using Pinecone and Chroma.

Maintenance & Community

The repository is actively maintained and updated, with a changelog detailing improvements. The author, Denis Rothman, has extensive experience in NLP and AI, with a background at institutions like Sorbonne University and companies like Airbus and IBM.

Licensing & Compatibility

The repository's licensing is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification on the license.

Limitations & Caveats

The project relies heavily on external API keys (OpenAI, Activeloop, Pinecone), which may incur costs. While notebooks are provided for ease of use, running them locally might require careful dependency management and specific environment configurations.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 54 stars in the last 90 days
