Awesome-RAG by Danielskry

Awesome list of RAG resources

Created 1 year ago

1,034 stars

Top 36.0% on SourcePulse

Project Summary

This repository is an "Awesome List" curating applications, frameworks, techniques, metrics, and databases for Retrieval-Augmented Generation (RAG) in Generative AI. It is aimed at researchers and developers who want to understand and implement RAG systems, which let LLMs draw on external, up-to-date, or domain-specific information to produce more accurate and tailored responses.

How It Works

RAG enhances LLMs by retrieving relevant context from external knowledge bases before generating a response. The core process involves chunking documents, creating vector embeddings for semantic search, storing these in a vector database, and then retrieving relevant chunks based on user query embeddings to augment the LLM's prompt. This approach allows LLMs to access information beyond their training data, improving factual accuracy and specificity.
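The steps described above (chunk, embed, store, retrieve, augment) can be sketched in a few lines of standard-library Python. This is an illustrative toy, not code from the repository: the bag-of-words `embed` function stands in for a real embedding model, `InMemoryVectorStore` stands in for a real vector database, and all names here are hypothetical. The final prompt would be passed to an LLM of your choice, which is left out.

```python
import math
import re
from collections import Counter


def chunk(text, size=12, overlap=4):
    """Split text into overlapping word windows (simple fixed-size chunking)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text):
    """Toy embedding: term-frequency vector. Assumption: a real system
    would call an embedding model here instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, chunk_text) pairs

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(query, contexts):
    """Augment the user query with the retrieved chunks."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context_block}\n\nQuestion: {query}")


docs = [
    "Qdrant is a vector database that stores embeddings for similarity search.",
    "BLEU and ROUGE compare generated text against reference text.",
    "Chunking splits long documents into smaller passages before embedding.",
]
store = InMemoryVectorStore()
for doc in docs:
    for piece in chunk(doc):
        store.add(piece)

query = "Which database stores embeddings?"
top = store.search(query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

Swapping `embed` for a learned embedding model and the store for one of the listed vector databases changes the quality of retrieval, but the overall flow stays the same.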

Quick Start & Requirements

This is a curated list, not a runnable application. To implement RAG, users will need to select and integrate various components like LLMs, embedding models, vector databases, and orchestration frameworks. Links to specific tools and frameworks are provided within the list for further exploration.

Highlighted Details

Extensive categorization of RAG approaches, including Agentic RAG, CRAG, RAFT, and GraphRAG.
Comprehensive listing of RAG frameworks such as Haystack, LangChain, LlamaIndex, and Flowise.
Detailed breakdown of techniques covering data cleaning, chunking strategies, embedding models, and retrieval methods.
In-depth coverage of evaluation metrics (e.g., BLEU, ROUGE, Groundedness) and popular vector databases (e.g., Chroma, Milvus, Pinecone, Qdrant, Weaviate).
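
To give a feel for the overlap-based metrics mentioned above, here is a minimal sketch of a ROUGE-1 style score (unigram overlap between a generated answer and a reference). It is a simplified illustration under stated assumptions: tokenization is a plain lowercase regex, there is no stemming or stopword handling, and the function name `rouge1` is hypothetical. Established evaluation packages should be preferred for real measurements.

```python
import re
from collections import Counter


def rouge1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap between two texts."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat is on the mat")
print(scores)
```

In this example five of the six unigrams match on each side, so precision, recall, and F1 all come out at 5/6. Metrics such as groundedness are not covered by a word-overlap score like this and typically need a separate judgment of whether the answer is supported by the retrieved context.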

Maintenance & Community

This is a community-driven "Awesome List." Contributions are welcomed to expand and update the resource. Specific contributor details or community channels are not highlighted in the README.

Licensing & Compatibility

The repository itself is a list of resources and does not have a specific license. The licenses of the individual tools and frameworks mentioned vary and should be checked independently.

Limitations & Caveats

As a curated list, this repository does not provide a ready-to-use RAG system. Users must select, configure, and integrate the various components themselves. The rapidly evolving nature of RAG means the list may require continuous updates to remain comprehensive.

Health Check

Last Commit: 3 weeks ago

Responsiveness: Inactive

Star History

82 stars in the last 30 days