langchain-rag-tutorial  by pixegami

LangChain RAG application tutorial

created 1 year ago
781 stars

Top 45.6% on sourcepulse

Project Summary

This repository is a short tutorial on building a Retrieval-Augmented Generation (RAG) application with LangChain. It targets developers and researchers who want a custom document Q&A system, and it walks through document loading, embedding, vector storage, and LLM querying.

How It Works

The application uses LangChain to orchestrate a RAG pipeline. It loads the documents and splits them into chunks, generates an embedding for each chunk through OpenAI, and stores the embeddings in a ChromaDB vector store. At query time it retrieves the chunks most relevant to the question and adds them to the LLM prompt, so the answer is grounded in the source documents.
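The pipeline described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: a bag-of-words counter stands in for OpenAI embeddings, an in-memory list stands in for ChromaDB, and every name here (split_into_chunks, embed, ToyVectorStore, build_prompt, PROMPT_TEMPLATE) and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=50, overlap=10):
    """Split text into overlapping windows of words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, last_start, step) if words]


def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database (assumption, not ChromaDB)."""

    def __init__(self):
        self._rows = []  # list of (chunk_text, embedding) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._rows.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k (score, chunk) pairs most similar to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), chunk) for chunk, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Hypothetical prompt wording; the retrieved chunks become the context.
PROMPT_TEMPLATE = """Answer the question based only on the following context:

{context}

---

Question: {question}"""


def build_prompt(store, question, k=2):
    """Retrieve the top-k chunks and place them in the LLM prompt."""
    hits = store.search(question, k)
    context = "\n\n---\n\n".join(chunk for _, chunk in hits)
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Usage: index a few chunks, then build the augmented prompt for a question.
demo_store = ToyVectorStore()
demo_store.add([
    "Alice follows the White Rabbit down the hole.",
    "The Mad Hatter hosts a tea party with the March Hare.",
])
print(build_prompt(demo_store, "Who hosts the tea party?", k=1))
```

In a real setup the prompt returned by build_prompt would be sent to the LLM. Swapping the toy embedding for a learned one changes the quality of retrieval but not the shape of the pipeline.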

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Install markdown dependencies: pip install "unstructured[md]"
  • Create database: python create_database.py
  • Query database: python query_data.py "How does Alice meet the Mad Hatter?"
  • Requires an OpenAI API key set as an environment variable.
  • macOS users may need to install onnxruntime through conda: conda install onnxruntime -c conda-forge
  • Windows users require Microsoft C++ Build Tools.
  • Official tutorial video: RAG+Langchain Python Project: Easy AI/Chat For Your Docs
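
Because both scripts need an OpenAI API key in the environment, a small preflight check can catch a missing key before any API call is made. The sketch below is an assumption-laden illustration, not part of the repository: the summary does not name the variable, so OPENAI_API_KEY (the name the OpenAI Python client reads by default) is assumed, and missing_env_vars and main are hypothetical helpers.

```python
import os
import sys

# Assumed variable name: the default one read by the OpenAI Python client.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY",)


def missing_env_vars(environ=os.environ, required=REQUIRED_ENV_VARS):
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not environ.get(name, "").strip()]


def main():
    """Report missing variables and return a shell-style exit code."""
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing),
              file=sys.stderr)
        return 1
    print("Environment looks ready.")
    return 0
```

Calling sys.exit(main()) before python create_database.py or python query_data.py would stop early with a readable message instead of failing partway through an API request.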

Highlighted Details

  • Demonstrates a complete RAG workflow from data ingestion to querying.
  • Utilizes ChromaDB for efficient vector storage and retrieval.
  • Integrates with OpenAI for LLM capabilities and embeddings.

Maintenance & Community

No specific information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Setup requires platform-specific workarounds (onnxruntime through conda on macOS, Microsoft C++ Build Tools on Windows), which suggests the dependency installation may be fragile. Because the tutorial depends on an OpenAI API key, it can only be run by users with OpenAI API access.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

80 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems).

super-rag by superagent-ai

0.3%
380
RAG pipeline for AI apps
created 1 year ago
updated 1 year ago
Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

chatgpt-pgvector by gannonh

0%
938
Domain-specific chat completions app
created 2 years ago
updated 2 years ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Elie Bursztein (Cybersecurity Lead at Google DeepMind).

LightRAG by HKUDS

1.0%
19k
RAG framework for fast, simple retrieval-augmented generation
created 10 months ago
updated 20 hours ago