nucliadb by nuclia

AI search database for RAG

Created 3 years ago
705 stars

Top 48.4% on SourcePulse

Project Summary

NucliaDB is an AI search database for storing unstructured data and running hybrid search over it, aimed at developers building RAG applications. It exposes a unified API for text, vector, and graph indexing, which simplifies data extraction and enrichment for NLP tasks.

How It Works

NucliaDB uses a hybrid search architecture that combines vector, full-text, and graph indexes, so a single query can match on meaning, exact terms, and relationships. Written in Rust and Python, it is optimized for indexing large datasets and supports multi-tenancy. It can optionally call Nuclia's cloud-based Understanding API for automated data extraction, enrichment, and inference, so users do not have to build that processing pipeline themselves.
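
To make the idea of combining several indexes concrete, here is a minimal sketch of one common way to merge ranked results from different retrievers: reciprocal rank fusion (RRF). This is an illustration only. The summary above does not say how NucliaDB merges results, and the function name, the constant k=60, and the document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several indexes rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, one per index type.
vector_hits = ["doc3", "doc1", "doc7"]
fulltext_hits = ["doc1", "doc9", "doc3"]
graph_hits = ["doc1", "doc7"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits, graph_hits])
print(fused)
```

In this sketch, doc1 ranks first because all three hypothetical indexes return it, even though it is not the top vector hit. That is the practical appeal of hybrid search: agreement across indexes counts for more than a single strong signal.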

Quick Start & Requirements

  • Install: Follow the Quick start guide.
  • Prerequisites: PostgreSQL for storage, plus an S3-compatible API, GCS, or Azure Blob Storage for blobs.
  • Resources: Details on resource requirements are available in the Nuclia Docs.

Highlighted Details

  • Hybrid search: Supports vector, full-text, and graph indexes.
  • Data types: Stores text, files, vectors, labels, and annotations.
  • NLP integration: Leverages Nuclia's Understanding API for data extraction and enrichment.
  • Cloud-native: Designed for distributed search and replication.

Maintenance & Community

  • Community: Active Slack channel for discussion and contributions.
  • Development: Actively maintained (last commit 16 hours ago, 32 pull requests in the past 30 days); contributions are welcome.
  • Resources: Links to Nuclia Docs, API Reference, and Community Chat are provided.

Licensing & Compatibility

  • License: GNU Affero General Public License v3.0 (AGPL-3.0).
  • Restrictions: AGPL-3.0 is a strong copyleft license. Anyone who distributes a modified version, or lets users interact with one over a network, must offer those users the corresponding source under the same license. Commercial use or linking with closed-source applications may therefore require careful review.

Limitations & Caveats

The AGPL-3.0 license imposes significant obligations on anyone who modifies the software and then distributes it or offers it as a network service: they must make the source of their modified version available to its users. The reliance on Nuclia's cloud APIs for advanced features may introduce vendor lock-in or additional costs.

Health Check

  • Last Commit: 16 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 32
  • Issues (30d): 0

Star History

5 stars in the last 30 days
