rag-stack by finic-ai

RAG stack for private ChatGPT-like corporate oracle

created 2 years ago
1,498 stars

Top 28.1% on sourcepulse

View on GitHub
1 Expert Loves This Project
Project Summary

RAGstack provides a self-hosted, private ChatGPT alternative for organizations, enabling users to query internal knowledge bases. It leverages Retrieval Augmented Generation (RAG) to enhance open-source LLMs like Llama 2, Falcon, and GPT4All with proprietary data, offering a cost-effective and reliable alternative to model fine-tuning.

How It Works

RAGstack implements RAG by integrating an open-source LLM (GPT4All locally, Falcon-7b or Llama 2 on cloud GPUs) with the Qdrant vector database. User-uploaded documents are processed, embedded, and stored in Qdrant. When a query is made, relevant document chunks are retrieved from Qdrant and injected into the LLM's context via a prompt, allowing it to generate answers based on the organization's specific data.
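The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is an illustrative sketch, not RAGstack's actual code: the toy embeddings, the in-memory chunk store (standing in for Qdrant), and the prompt template are all hypothetical stand-ins.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=2):
    # `store` is a list of (chunk_text, embedding) pairs; Qdrant performs
    # this nearest-neighbor search far more efficiently at scale.
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, chunks):
    # Inject the retrieved chunks into the LLM's context as plain text.
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Toy 3-dimensional embeddings; a real embedding model produces these vectors.
store = [
    ("Vacation policy: 20 days per year.", [0.9, 0.1, 0.0]),
    ("Expense reports are due monthly.", [0.1, 0.9, 0.0]),
]
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the question below
prompt = build_prompt("How many vacation days do I get?", retrieve(query_vec, store, k=1))
```

The key design point is that the LLM never sees the whole knowledge base: only the top-k most similar chunks are injected, which keeps the prompt within the model's context window.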

Quick Start & Requirements

  • Local: Clone the repository, copy environment files (local.env, example.env), configure Supabase credentials, create a ragstack_users table in Supabase, and run scripts/local/run-dev. This downloads ggml-gpt4all-j-v1.3-groovy.bin and starts local services.
  • Cloud: Deployment scripts (scripts/gcp/deploy-gcp.sh, scripts/aws/deploy-aws.sh, azure/deploy-aks.sh) use Terraform to deploy to Google Cloud (GKE), AWS (ECS), or Azure (AKS) with GPU-accelerated models. Cloud deployments require cloud provider credentials and a HuggingFace token.
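For the local setup, the environment file supplies the Supabase credentials. The variable names below are hypothetical stand-ins; check example.env in the repository for the actual keys it expects.

```
# local.env — hypothetical keys, shown only to illustrate the shape of the file
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-service-role-key
```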

Highlighted Details

  • Supports local CPU-based inference with GPT4All.
  • Cloud deployments leverage GPU-enabled clusters for Falcon-7b and Llama 2.
  • Integrates with Qdrant for efficient vector storage and retrieval.
  • Terraform scripts automate deployment across GCP, AWS, and Azure.
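The ingestion path mentioned above (documents processed, embedded, and stored in Qdrant) starts with splitting documents into chunks. A minimal sketch of overlapping character-window chunking, assuming made-up sizes; this is not RAGstack's actual chunking code:

```python
def chunk_text(text, size=40, overlap=10):
    # Split text into overlapping windows so content near a chunk boundary
    # also appears with context in the neighboring chunk.
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(piece)
        if start + size >= len(text):
            break
    return chunks

doc = "RAGstack stores uploaded documents as embedded chunks in Qdrant for retrieval."
chunks = chunk_text(doc, size=40, overlap=10)
```

Each chunk would then be embedded and upserted into a Qdrant collection, with the chunk text kept as payload so it can be injected into the prompt at query time.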

Maintenance & Community

The project appears to be actively developed, with recent additions including GPT4All, Falcon-7b, and cloud deployment support for GCP, AWS, and Azure. No specific community channels or contributor details are provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project requires Supabase for user management and authentication, which may be an additional dependency for some users. Cloud deployment scripts require specific cloud provider configurations and credentials. Llama-2-40b support is listed as "in progress" on the roadmap.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Jeremy Howard (cofounder of fast.ai), and 3 more.

cohere-toolkit by cohere-ai

RAG toolkit for LLM application development and deployment

  • Top 0.2%
  • 3k stars
  • created 1 year ago
  • updated 1 week ago