neighbor by ankane

Rails gem for nearest neighbor search

created 4 years ago
718 stars

Top 48.9% on sourcepulse

Project Summary

Neighbor is a Ruby gem that adds nearest neighbor search to Ruby on Rails applications. It works with several databases and vector extensions, so developers can run similarity search over embeddings and other vector data directly from their ActiveRecord models.

How It Works

Neighbor stores and queries vector data using each database's own vector support: the cube and pgvector extensions for PostgreSQL, sqlite-vec for SQLite, and the built-in vector support in MariaDB and MySQL. It exposes this through an ActiveRecord interface (has_neighbors): users declare which columns hold vectors on a model, then run nearest neighbor searches with a chosen distance metric (Euclidean, cosine, Hamming, and others). The gem also supports approximate indexes such as HNSW and IVFFlat to speed up searches on large tables.
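
To make the idea concrete, here is a minimal plain-Ruby sketch of what a nearest neighbor query computes: rank stored vectors by their distance to a query vector and keep the closest k. This is not the gem's implementation (the gem delegates the work to the database), and the names Item, ITEMS, euclidean_distance, and nearest are hypothetical, chosen only for illustration.

```ruby
# Conceptual sketch only: Neighbor itself runs this kind of search in the database.
# All names below (Item, ITEMS, euclidean_distance, nearest) are hypothetical.
Item = Struct.new(:id, :embedding)

ITEMS = [
  Item.new(1, [1.0, 1.0, 1.0]),
  Item.new(2, [2.0, 2.0, 2.0]),
  Item.new(3, [1.0, 1.0, 2.0])
].freeze

# Euclidean (L2) distance between two vectors of equal length.
def euclidean_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Brute-force search: sort every item by distance to the query, keep the first k.
def nearest(items, query, k:)
  items.sort_by { |item| euclidean_distance(item.embedding, query) }.first(k)
end

p nearest(ITEMS, [1.0, 1.0, 1.0], k: 2).map(&:id) # prints [1, 3]
```

A brute-force scan like this compares the query against every row, which is why approximate indexes matter once a table grows large.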

Quick Start & Requirements

  • Install via Bundler: add gem "neighbor" to your Gemfile and run bundle install.
  • Requires a compatible database (Postgres, SQLite, MariaDB, MySQL) and relevant extensions/types.
  • Setup involves generating migrations to add vector columns and configuring models with has_neighbors.
  • Official documentation and examples are available for detailed setup and usage: https://github.com/ankane/neighbor

Highlighted Details

  • Supports multiple distance metrics: euclidean, cosine, taxicab, chebyshev, inner_product, hamming, jaccard.
  • Offers advanced features like half-precision vectors, binary vectors, sparse vectors, and indexing options (HNSW, IVFFlat).
  • Integrates with external embedding models (OpenAI, Cohere, Informers, Transformers.rb) for generating and searching embeddings.
  • Provides examples for hybrid search, recommendations, and sparse search.
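
For reference, the listed metrics can be written out in plain Ruby as below. This is an illustrative sketch of the standard mathematical definitions, not the gem's code (the database computes the actual distances), and the module name Distances is hypothetical. Hamming and Jaccard are shown on arrays of 0/1 bits, which is an assumption about how binary vectors are represented.

```ruby
# Illustrative definitions only; `Distances` is a hypothetical module name.
module Distances
  module_function

  # Straight-line (L2) distance.
  def euclidean(a, b)
    Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
  end

  # Sum of absolute differences (L1).
  def taxicab(a, b)
    a.zip(b).sum { |x, y| (x - y).abs }
  end

  # Largest absolute difference along any single dimension.
  def chebyshev(a, b)
    a.zip(b).map { |x, y| (x - y).abs }.max
  end

  # Dot product: a similarity, so larger means closer.
  def inner_product(a, b)
    a.zip(b).sum { |x, y| x * y }
  end

  # Cosine distance = 1 - cosine similarity.
  def cosine(a, b)
    norms = Math.sqrt(inner_product(a, a)) * Math.sqrt(inner_product(b, b))
    1.0 - inner_product(a, b) / norms
  end

  # Number of positions where two bit arrays differ.
  def hamming(a, b)
    a.zip(b).count { |x, y| x != y }
  end

  # Jaccard distance on bit arrays: 1 - |intersection| / |union|.
  def jaccard(a, b)
    both = a.zip(b).count { |x, y| x == 1 && y == 1 }
    either = a.zip(b).count { |x, y| x == 1 || y == 1 }
    either.zero? ? 0.0 : 1.0 - both.fdiv(either)
  end
end

p Distances.euclidean([0, 0], [3, 4])             # prints 5.0
p Distances.hamming([1, 0, 1, 1], [1, 1, 0, 1])   # prints 2
```

Which metric fits depends on the data: cosine ignores vector length, so it suits embeddings whose magnitude carries no meaning, while Hamming and Jaccard apply to binary vectors.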

Maintenance & Community

  • Developed by ankane, a known contributor in the Ruby data science and ML space.
  • The repository is maintained: the most recent commit was about 1 month ago, with no new issues or pull requests in the last 30 days.
  • Contribution guidelines and development setup instructions are provided.

Licensing & Compatibility

  • Released under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

  • Experimental support for MariaDB and MySQL may have limitations or require specific versions/configurations.
  • Performance heavily relies on the underlying database's vector capabilities and indexing strategies.
  • Some advanced features or specific distance metrics might be tied to particular database extensions.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 45 stars in the last 90 days

Explore Similar Projects

Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

pgvector-node by pgvector

0.8%
399
Node.js library for pgvector support
created 4 years ago
updated 2 weeks ago
Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Hiroshi Shibata (Core Contributor to Ruby), and 2 more.

searchkick by ankane

0.0%
7k
Ruby gem for integrating intelligent search
created 12 years ago
updated 1 month ago