NanoSage by masterFoad

Local-LLM-powered recursive search and report generation

Created 7 months ago
252 stars

Top 99.6% on SourcePulse

View on GitHub
Project Summary

NanoSage is a local, recursive search and knowledge exploration tool designed for deep research. It assists users by systematically breaking down queries, building a knowledge base from local and web data, and dynamically exploring subqueries with a focus on relevance and depth. The tool is ideal for researchers, students, and power users seeking structured, in-depth reports generated via retrieval-augmented generation (RAG) on their own hardware.

How It Works

NanoSage employs a structured, relevance-driven recursive search pipeline. It refines user queries, builds a knowledge base from local files and web searches, and tracks its exploration progress via a Table of Contents (TOC). Monte Carlo-based exploration balances search breadth and depth, ranking subqueries by relevance to maintain precision. The system then generates a detailed Markdown report using RAG, integrating insights from the most valuable findings.
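The pipeline above can be sketched in a few lines. This is an illustrative toy, not NanoSage's actual code: the function names, the word-overlap relevance score, and the fake subquery decomposition are all assumptions (the real tool generates subqueries with a local LLM and scores them with a retrieval model). It shows the core idea of relevance-pruned, Monte Carlo-weighted recursive expansion.

```python
import random

def relevance(subquery: str, root_query: str) -> float:
    """Toy relevance score: fraction of root-query words the subquery shares."""
    root_words = set(root_query.lower().split())
    sub_words = set(subquery.lower().split())
    return len(root_words & sub_words) / max(len(root_words), 1)

def explore(query: str, root: str, depth: int, max_depth: int,
            min_relevance: float = 0.3, branch_factor: int = 2) -> list[str]:
    """Return a flat list of accepted subqueries (a simplified TOC)."""
    if depth > max_depth:
        return []
    accepted = [query]
    # NanoSage derives subqueries with an LLM; here we fake the decomposition.
    candidates = [f"{query} {suffix}" for suffix in ("overview", "examples", "limitations")]
    scored = [(relevance(c, root), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s >= min_relevance]  # prune low relevance
    if scored:
        # Monte Carlo step: sample branches with probability proportional to
        # relevance (with replacement, so a strong branch may be expanded twice).
        weights = [s for s, _ in scored]
        picks = random.choices(scored, weights=weights, k=min(branch_factor, len(scored)))
        for _, sub in picks:
            accepted.extend(explore(sub, root, depth + 1, max_depth))
    return accepted

toc = explore("retrieval augmented generation", "retrieval augmented generation",
              depth=0, max_depth=2)
```

In this sketch, `max_depth` plays the role of the `--max_depth` flag: it caps recursion, while the relevance threshold keeps low-value subqueries out of the TOC.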

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.8+, Ollama with Gemma 2B model (ollama pull gemma2:2b). Optional GPU acceleration requires PyTorch with CUDA.
  • Run: python main.py --query "Your query here" --web_search --max_depth 2 --device cpu
  • Docs: Example Report

Highlighted Details

  • Recursive search with relevance-based subquery pruning.
  • Integrates local files (PDFs, text) and web search results.
  • Supports RAG with models like Gemma 2B for report generation.
  • Generates structured Markdown reports detailing the research journey.
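The report-generation step can also be sketched: sort the accumulated findings by relevance, then emit a Markdown document with a TOC and one section per finding. The data shape and function name here are hypothetical; the real project delegates per-section summarization to a local model such as Gemma 2B via Ollama rather than using precomputed summaries.

```python
def build_report(query: str, findings: list[dict]) -> str:
    """Assemble a structured Markdown report from ranked findings.

    findings: [{"title": str, "score": float, "summary": str}, ...]
    """
    ranked = sorted(findings, key=lambda f: f["score"], reverse=True)
    lines = [f"# Report: {query}", "", "## Table of Contents"]
    lines += [f"- {f['title']}" for f in ranked]
    for f in ranked:
        lines += ["", f"## {f['title']}", f["summary"]]
    return "\n".join(lines)
```

Ordering sections by relevance score means the most valuable findings lead the report, mirroring the relevance-driven ranking used during search.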

Maintenance & Community

  • Project maintained by Foad Abo Dahood.
  • BibTeX and APA citation provided for academic use.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

The system's effectiveness depends on the quality of the chosen retrieval model and the summarization LLM. Performance varies with hardware, especially in CPU-only mode. Recursion depth and relevance thresholds require tuning for optimal results.

Health Check
Last Commit

7 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
3 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Nir Gazit (cofounder of Traceloop), and 4 more.

llmware by llmware-ai

0.6%
14k
Framework for enterprise RAG pipelines using small, specialized models
Created 2 years ago
Updated 1 month ago