lightonai: Multi-vector search and semantic code intelligence for agents
Top 88.4% on SourcePulse
Summary
NextPlaid is a local-first, multi-vector search engine designed for general retrieval workloads, while ColGREP is a specialized semantic code search tool built upon it. This combination offers developers and power users a powerful, privacy-preserving way to search codebases and other data locally. The primary benefit is enabling deep, semantic understanding of code and text without sending sensitive data to external servers, leveraging efficient, CPU-first indexing and search capabilities.
How It Works
ColGREP operates as a single, dependency-free Rust binary that combines regex filtering with semantic ranking for code search, and it updates its index incrementally as files change. NextPlaid, the underlying engine, takes a multi-vector approach to semantic search: it uses Tree-sitter to parse code into structured representations and generates approximately 300 embeddings per code unit, capturing richer context than single-vector methods. The resulting indices stay compact through product quantization (2-bit/4-bit) and memory-mapping, enabling efficient disk-based storage with low RAM usage. NextPlaid also supports incremental updates and metadata pre-filtering via SQLite, and is optimized for CPU, with optional CUDA support.
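The README snippet does not spell out NextPlaid's scoring function, but multi-vector retrievers in this family typically rank with ColBERT-style late interaction ("MaxSim"): each query embedding is matched against its best document embedding, and the per-token maxima are summed. A minimal, illustrative sketch (the function name and toy vectors are ours, not NextPlaid's API):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    # Late-interaction scoring: similarity of every query token vector to
    # every document token vector, then sum of each row's maximum.
    sims = query_vecs @ doc_vecs.T          # (q_tokens, d_tokens)
    return float(sims.max(axis=1).sum())

# Toy example with 2-D unit vectors: the aligned document scores higher.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d_good = np.array([[0.9, 0.1], [0.1, 0.9]])
d_bad = np.array([[-1.0, 0.0], [0.0, -1.0]])
print(maxsim_score(q, d_good) > maxsim_score(q, d_bad))  # True
```

Because every code unit keeps many vectors, this per-token matching is what gives multi-vector search its finer-grained relevance signal; product quantization keeps the storage cost of those extra vectors manageable.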
Quick Start & Requirements
For ColGREP, installation is available via Homebrew (brew install lightonai/tap/colgrep) or a shell installer script (curl ... | sh). After installation, build the index with colgrep init /path/to/project or colgrep init in the current directory. Searches are then performed with colgrep "query". ColGREP is a single Rust binary with no external dependencies.
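Put together, the ColGREP quick start looks like the following shell session (the project path and query string are placeholders):

```shell
# Install ColGREP via Homebrew
brew install lightonai/tap/colgrep

# Build the index for a project (or run `colgrep init` inside it)
colgrep init /path/to/project

# Search the indexed codebase semantically
colgrep "query"
```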
NextPlaid is run via Docker. CPU images are ghcr.io/lightonai/next-plaid:cpu-1.1.3, and GPU images are ghcr.io/lightonai/next-plaid:cuda-1.1.3. A Python client is available via pip install next-plaid-client. Requirements include Docker for the server and Python 3.x for the client. GPU usage necessitates CUDA-compatible hardware.
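A minimal server launch might look like the following; the published port (8080 here) and container flags are assumptions not taken from the snippet, so check the project's documentation for the actual values:

```shell
# CPU image (port mapping is illustrative)
docker run --rm -p 8080:8080 ghcr.io/lightonai/next-plaid:cpu-1.1.3

# GPU image (requires CUDA-capable hardware and the NVIDIA container toolkit)
docker run --rm --gpus all -p 8080:8080 ghcr.io/lightonai/next-plaid:cuda-1.1.3

# Python client for talking to the server
pip install next-plaid-client
```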
Highlighted Details
Maintenance & Community
Key contributors include Raphaël Sourty, Artem Dinaburg, Igor Carron, Chao-Chun (Joe) Hsu, Raymond Weitekamp, Szymon Rączka, and Mark Motliuk. No specific community channels (e.g., Discord, Slack) or roadmap links were detailed in the provided README snippet.
Licensing & Compatibility
The project is licensed under the Apache-2.0 license. This permissive license allows commercial use and integration into closed-source projects, subject to its standard notice and attribution requirements.
Limitations & Caveats
Integrations for OpenCode and Codex are currently basic, with contributions welcomed. NextPlaid is positioned for serving and streaming ingestion, while its companion FastPlaid is recommended for bulk offline indexing and experiments.