VectorRAG.Net by likeslines-maker

.NET-native embedded vector database for RAG and semantic search

Created 3 months ago
372 stars

Top 75.8% on SourcePulse

Project Summary

VectorRAG.Net is an embedded vector database for .NET developers who need semantic search or Retrieval-Augmented Generation (RAG) without running a separate database server. Because the library runs inside the application process, it suits local AI assistants, offline knowledge bases, and low-latency applications, and it avoids the network overhead of calling an external service.

How It Works

VectorRAG.Net uses a fully embedded, in-process architecture built for .NET 8 applications. Its approximate nearest neighbor (ANN) search works in two stages: Random Hyperplane LSH generates a small set of candidates, and an exact rerank then scores those candidates by dot-product or cosine similarity. The library also supports hybrid search, which combines BM25 keyword relevance scoring with vector similarity so that results reflect both exact term matches and semantic closeness.
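
The two-stage search described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's API: the class name HyperplaneLshIndex, its methods, and the single-table design are assumptions made for illustration. Each vector is hashed to one bit per random hyperplane, the query is matched against vectors in the same bucket, and only those candidates are scored exactly.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class HyperplaneLshIndex:
    """Hypothetical single-table index (illustration only, not VectorRAG.Net's API)."""

    def __init__(self, dim, num_planes=8, seed=0):
        rng = random.Random(seed)
        # Each random Gaussian vector is the normal of one hyperplane through the origin.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]
        self.buckets = {}  # signature -> list of vector ids
        self.vectors = {}  # id -> normalized vector

    def signature(self, v):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(dot(p, v) >= 0 for p in self.planes)

    def add(self, vec_id, v):
        v = normalize(v)
        self.vectors[vec_id] = v
        self.buckets.setdefault(self.signature(v), []).append(vec_id)

    def search(self, query, top_k=5):
        q = normalize(query)
        # Stage 1, candidate generation: only vectors in the query's bucket.
        candidates = self.buckets.get(self.signature(q), [])
        # Stage 2, exact rerank: true cosine similarity on the candidates.
        scored = sorted(((dot(q, self.vectors[i]), i) for i in candidates), reverse=True)
        return [(i, s) for s, i in scored[:top_k]]
```

With a single hash table, a true neighbor that lands in a different bucket is missed. Practical LSH indexes usually trade memory for recall by using several tables or by probing nearby buckets; whether VectorRAG.Net does either is not stated in the summary.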

Quick Start & Requirements

  • Installation: dotnet add package VectorRAG.Net
  • Prerequisites: .NET 8.
  • Links: NuGet, GitHub

Highlighted Details

  • Features SIMD-optimized vector search and LSH-based ANN indexing.
  • Supports Hybrid Search (Vector + BM25) and metadata filtering.
  • Includes automatic document chunking and snapshot save/load capabilities.
  • Claims high performance: Vector Search (TopK=5) at 15.15 μs, Hybrid Search at 116.73 μs (benchmarked on 10,000 docs, 64-dim embeddings).
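
As a rough illustration of the hybrid search listed above, the sketch below scores documents with Okapi BM25 and blends that score with cosine similarity. It is written in Python for brevity and does not use the library's API. The function names, the max-normalization of BM25 scores, and the weighted-sum fusion with the parameter alpha are all assumptions; the summary does not say how VectorRAG.Net combines the two signals.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the given query terms."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(s)
    return scores


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


def hybrid_rank(query_terms, query_vec, docs, doc_vecs, alpha=0.5, top_k=5):
    """Assumed fusion: alpha * cosine + (1 - alpha) * max-normalized BM25."""
    kw = bm25_scores(query_terms, docs)
    max_kw = max(kw) or 1.0  # avoid division by zero when no term matches
    fused = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * kw[i] / max_kw, i)
        for i, vec in enumerate(doc_vecs)
    ]
    fused.sort(reverse=True)
    return [(i, s) for s, i in fused[:top_k]]
```

Setting alpha to 1 reduces this to pure vector ranking and alpha to 0 to pure keyword ranking, which makes the weight a convenient knob when comparing the two modes on a sample of queries.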

Maintenance & Community

No specific details regarding contributors, sponsorships, or community channels (e.g., Discord, Slack) are provided in the README.

Licensing & Compatibility

VectorRAG.Net is commercial software. Free usage is permitted for evaluation, development, research, educational purposes, and proof-of-concept projects. Production use requires an active commercial subscription, priced at $1,000 USD per month per organization. The library is implemented natively for .NET 8.

Limitations & Caveats

The primary limitation is the commercial licensing requirement for production deployment. Performance benchmarks are specific to the tested environment and workload; actual results will vary based on embedding dimensions, dataset size, and hardware.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 444 stars in the last 30 days
