NyRAG by abhishekkrthakur

No-code RAG framework for scalable knowledge retrieval

Created 2 weeks ago

278 stars

Top 93.5% on SourcePulse

Summary

NyRAG is a scalable, no-code framework for building Retrieval-Augmented Generation (RAG) applications. It targets developers and researchers who need to ground LLM responses in external knowledge: it ingests websites or documents, indexes them in Vespa for hybrid search, and serves results through a chat UI. Its main benefit is simplifying complex RAG pipelines so that answers are more comprehensive and better grounded.

How It Works

NyRAG uses a multi-stage retrieval process. An LLM first expands each user query into multiple search terms, which are then converted into embeddings with SentenceTransformer models. Vespa runs a hybrid (lexical plus vector) search with these embeddings, retrieving document chunks ranked by a best_chunk_score rank profile. The result lists from the individual queries are fused and deduplicated, the top-k chunks are selected, and finally an LLM synthesizes an answer grounded solely in the retrieved context. This multi-query, chunk-level approach broadens retrieval coverage and improves accuracy.
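
To make the flow concrete, here is a minimal sketch of that pipeline, assuming pyvespa, sentence-transformers, and an OpenAI-compatible client. It is illustrative only: the helper names (expand_query, retrieve, answer), the doc schema, the embedding field, and the model names are assumptions, not NyRAG's actual API; only the overall shape (expand, embed, hybrid search, fuse, answer) and the best_chunk_score profile come from the description above.

```python
# Illustrative sketch only -- helper names, schema, and field names are
# assumptions, not NyRAG's actual API.
from openai import OpenAI
from sentence_transformers import SentenceTransformer
from vespa.application import Vespa

llm = OpenAI()  # any OpenAI-compatible endpoint works
embedder = SentenceTransformer("all-MiniLM-L6-v2")      # example embedding model
vespa_app = Vespa(url="http://localhost", port=8080)    # local Docker deployment

def expand_query(question: str, n: int = 3) -> list[str]:
    """Step 1: have the LLM rewrite the question into several search queries."""
    reply = llm.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content":
                   f"Rewrite this question as {n} distinct search queries, "
                   f"one per line:\n{question}"}],
    )
    lines = reply.choices[0].message.content.splitlines()
    return [q.strip() for q in lines if q.strip()][:n]

def retrieve(question: str, top_k: int = 5) -> list[dict]:
    seen, fused = set(), []
    for q in [question] + expand_query(question):
        vector = embedder.encode(q).tolist()  # step 2: embed each query
        # Step 3: hybrid search -- lexical userQuery() OR'd with a
        # nearestNeighbor clause, ranked by the best_chunk_score profile.
        # 'doc' and 'embedding' are assumed schema/field names.
        result = vespa_app.query(body={
            "yql": "select * from doc where userQuery() or "
                   "({targetHits:50}nearestNeighbor(embedding, q))",
            "query": q,
            "ranking": "best_chunk_score",
            "input.query(q)": vector,
        })
        for hit in result.hits:  # step 4: fuse result lists, drop duplicates
            if hit["id"] not in seen:
                seen.add(hit["id"])
                fused.append(hit)
    fused.sort(key=lambda h: h["relevance"], reverse=True)
    return fused[:top_k]  # step 5: top-k chunks become the grounding context

def answer(question: str) -> str:
    """Step 6: synthesize an answer from the retrieved chunks only."""
    context = "\n\n".join(str(h["fields"]) for h in retrieve(question))
    reply = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content":
                   "Answer using only the provided context:\n" + context},
                  {"role": "user", "content": question}],
    )
    return reply.choices[0].message.content
```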

Quick Start & Requirements

Installation is straightforward via pip install nyrag. The project supports two deployment modes: 'Local' (using Docker for Vespa) and 'Cloud' (deploying to Vespa Cloud). Data ingestion supports 'Web' crawling or 'Docs' processing. Configuration is managed through YAML files and environment variables. Key requirements include Python 3.10+, Docker (for local), a Vespa Cloud tenant (for cloud), and an OpenAI-compatible LLM API endpoint and key. Setup involves defining crawl/doc parameters, RAG settings (embedding model, chunk size), and LLM configurations. Links to Ollama (https://ollama.ai) and LM Studio (https://lmstudio.ai) are provided for local LLM setup.
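
As an illustration of what such a configuration might look like, here is a hypothetical YAML sketch. Every key name below is an assumption inferred from the parameters the README describes (deployment mode, crawl source, embedding model, chunk size, LLM endpoint), not NyRAG's documented schema.

```yaml
# Hypothetical NyRAG-style config -- key names are illustrative only.
project: my-knowledge-base
deployment: local            # 'local' (Docker Vespa) or 'cloud' (Vespa Cloud)

source:
  type: web                  # 'web' crawling or 'docs' processing
  url: https://example.com
  max_pages: 200

rag:
  embedding_model: sentence-transformers/all-MiniLM-L6-v2
  chunk_size: 512
  top_k: 5

llm:
  base_url: http://localhost:11434/v1   # e.g. Ollama's OpenAI-compatible API
  model: llama3.1
  api_key_env: OPENAI_API_KEY           # key read from an environment variable
```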

Highlighted Details

  • Advanced, scalable, no-code RAG pipeline.
  • Multi-query retrieval strategy for improved context gathering.
  • Hybrid search capabilities powered by Vespa.
  • Extensive LLM support via OpenAI-compatible APIs (OpenRouter, Ollama, LM Studio, vLLM, OpenAI); see the client sketch after this list.
  • Flexible deployment options: Local (Docker) and Cloud (Vespa Cloud).
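
Because the LLM layer targets any OpenAI-compatible endpoint, switching backends amounts to changing a base URL. A minimal sketch with the openai Python client follows; the localhost URLs are the standard Ollama and LM Studio endpoints, and the model name is an example:

```python
from openai import OpenAI

# Same client code, different backends: only base_url and api_key change.
client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # local servers accept a placeholder key
)
# For LM Studio, use http://localhost:1234/v1 instead.

reply = client.chat.completions.create(
    model="llama3.1",  # example: a model already pulled into Ollama
    messages=[{"role": "user", "content": "What does NyRAG index?"}],
)
print(reply.choices[0].message.content)
```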

Maintenance & Community

The README does not name maintainers, list community channels (such as Discord or Slack), or describe a project roadmap.

Licensing & Compatibility

The README does not explicitly state the project's license type or provide compatibility notes for commercial use.

Limitations & Caveats

Users must configure LLM API access and, depending on deployment mode, set up Docker or a Vespa Cloud environment. The lack of explicit licensing information may be a barrier to commercial adoption or contribution. The system's effectiveness depends on the quality of the source data and the chosen LLM.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 4
Issues (30d): 1
Star History: 281 stars in the last 20 days
