search by kelindar

Go library for embedded vector search and semantic embeddings

created 10 months ago
478 stars

Top 64.9% on sourcepulse

Project Summary

This Go library provides embedded vector search and semantic embeddings for small to medium-scale projects, targeting Go developers needing efficient semantic capabilities without complex infrastructure. It leverages llama.cpp and GGUF BERT models for accurate, fast, and lean semantic search with optional GPU acceleration.

How It Works

The library calls llama.cpp through purego rather than cgo, loading the shared C library directly from Go; this simplifies integration and cross-compilation. It supports GGUF-formatted BERT models for generating text embeddings. For search, it uses a brute-force nearest-neighbor scan with SIMD optimizations, suitable for datasets under 100,000 entries, and supports saving and loading search indexes.
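
The brute-force stage is conceptually simple: compare the query embedding against every stored vector and keep the best matches. A minimal sketch in plain Go — without the library's SIMD optimizations, and with illustrative types and function names rather than the package's actual API:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// entry pairs a stored embedding with its payload.
type entry struct {
	vec  []float32
	text string
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search scans every entry (O(n·d) per query) and returns the top-k payloads.
func search(index []entry, query []float32, k int) []string {
	type scored struct {
		score float64
		text  string
	}
	scores := make([]scored, 0, len(index))
	for _, e := range index {
		scores = append(scores, scored{cosine(e.vec, query), e.text})
	}
	sort.Slice(scores, func(i, j int) bool { return scores[i].score > scores[j].score })
	if k > len(scores) {
		k = len(scores)
	}
	out := make([]string, k)
	for i := range out {
		out[i] = scores[i].text
	}
	return out
}

func main() {
	index := []entry{
		{[]float32{1, 0, 0}, "cats"},
		{[]float32{0, 1, 0}, "finance"},
		{[]float32{0.9, 0.1, 0}, "kittens"},
	}
	fmt.Println(search(index, []float32{1, 0.05, 0}, 2)) // [cats kittens]
}
```

The linear scan is why the approach stays fast only up to roughly 100,000 entries: every query touches every vector, so cost grows with both dataset size and embedding dimension.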

Quick Start & Requirements

  • Install: Precompiled binaries are available in the dist directory for Windows and Linux. For other platforms or custom builds, compile from source.
  • Prerequisites: C/C++ compiler and CMake are required for compilation. Vulkan SDK is needed for GPU acceleration.
  • Resources: GPU acceleration is recommended for performance.
  • Docs: llama.cpp build documentation for GPU options.

Highlighted Details

  • llama.cpp without cgo: Simplifies Go integration and cross-compilation.
  • GGUF BERT Model Support: Leverages a wide range of BERT models.
  • Vulkan GPU Acceleration: Precompiled binaries include Vulkan support for Windows and Linux.
  • Search Index: Supports saving and loading indexes for persistent vector search.
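
Persisting an index amounts to serializing the stored vectors and their payloads. The library's actual on-disk format is not described in the README; purely as an illustration of the idea, here is a round-trip sketch using Go's standard encoding/gob (all names are hypothetical):

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Entry pairs an embedding with its payload; fields are exported so gob can encode them.
type Entry struct {
	Vec  []float32
	Text string
}

// saveIndex serializes the entries to a byte slice (could be written to a file).
func saveIndex(entries []Entry) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(entries); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// loadIndex restores entries previously written by saveIndex.
func loadIndex(data []byte) ([]Entry, error) {
	var entries []Entry
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&entries)
	return entries, err
}

func main() {
	data, _ := saveIndex([]Entry{{Vec: []float32{1, 0}, Text: "hello"}})
	restored, _ := loadIndex(data)
	fmt.Println(restored[0].Text) // prints: hello
}
```

Persisting the index this way means embeddings are computed once at indexing time; a restarted process can reload and search immediately without re-embedding the corpus.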

Maintenance & Community

The project is maintained by kelindar. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The brute-force search approach becomes a performance bottleneck on datasets exceeding 100,000 entries. The library lacks advanced query features such as multi-field filtering or fuzzy matching, and handling high-dimensional embeddings in real time requires sufficient GPU resources.

Health Check
Last commit

1 month ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
39 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Tim J. Baek (Founder of Open WebUI), and 5 more.

gemma.cpp by google

0.1%
7k
C++ inference engine for Google's Gemma models
created 1 year ago
updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (Former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

0.4%
84k
C/C++ library for local LLM inference
created 2 years ago
updated 14 hours ago