turboquant-wasm by teamchong

Vector compression and fast search for web and edge

Created 4 weeks ago

308 stars

Top 87.1% on SourcePulse

Project Summary

This repository provides an experimental WebAssembly (WASM) build of TurboQuant that compresses vectors and computes dot products directly on the compressed data, in browsers and in Node.js. It targets developers who need to shrink the memory footprint of embedding vectors for applications such as real-time search, image similarity, or LLM KV cache compression. It reports a roughly 6x size reduction relative to Float32 and requires no training step.

How It Works

The project implements the TurboQuant algorithm, which employs polar decomposition and QJL rotation, in Zig and compiles it to WASM with relaxed SIMD instructions for CPU execution. It offers two execution paths: the core turboquant-wasm npm package uses WASM for general vector operations, while an optional WebGPU compute shader path accelerates dotBatch by processing compressed vectors directly on the GPU. The result is a single, dependency-light package that falls back to WASM transparently when WebGPU is unavailable.
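The GPU-then-CPU fallback described above can be sketched with standard web APIs. This is only an illustration of the pattern, not the package's own code: `pickBackend` and the `Backend` type are hypothetical names, and the only real API used is `navigator.gpu.requestAdapter()` from the WebGPU specification.

```typescript
// Hypothetical helper, not part of turboquant-wasm's API.
type Backend = "webgpu" | "wasm";

async function pickBackend(): Promise<Backend> {
  // navigator.gpu is the WebGPU entry point; it is undefined where WebGPU is unsupported.
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) return "wasm";
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter is available.
    const adapter = await gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Any failure while probing the GPU falls back to the CPU path.
    return "wasm";
  }
}

pickBackend().then((backend) => console.log(`selected backend: ${backend}`));
```

Probing once at startup and caching the result keeps the choice out of the hot path; a caller never needs to know which path ran.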

Quick Start & Requirements

  • Install: npm install turboquant-wasm
  • Prerequisites:
    • Runtime: Chrome 114+, Firefox 128+, Safari 18+, Node.js 20+ (for relaxed SIMD).
    • WebGPU: Chrome 113+, Edge 113+ (for GPU acceleration).
    • Build: Zig 0.15.2 and Bun are required for building from source.
  • Documentation: Refer to the TypeScript API and the original TurboQuant research paper ("TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate", ICLR 2026).

Highlighted Details

  • Achieves approximately 4.5 bits per dimension (bpd), yielding a ~6x compression ratio compared to Float32.
  • Enables direct dot product calculations on compressed vectors, avoiding costly decompression.
  • Features a "no-training" approach, allowing immediate encoding of any vector.
  • The dotBatch method transparently utilizes WebGPU compute shaders when available, falling back to WASM SIMD.
  • Verified byte-identical output with the reference Zig implementation via golden-value tests.
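As a quick check on the storage figures above, the arithmetic can be written out. The sketch below counts only the quantized payload at a given bit rate; the function names and the 1024-dimension example are assumptions chosen for illustration. At 4.5 bpd the payload-only ratio comes out near 7.1x, somewhat above the ~6x quoted, and the source does not spell out what accounts for the difference.

```typescript
// Bytes needed to store one vector's quantized payload at a given bit rate.
function payloadBytes(dim: number, bitsPerDim: number): number {
  return Math.ceil((dim * bitsPerDim) / 8);
}

// Size of the Float32 original divided by the size of the quantized payload.
function ratioVsFloat32(dim: number, bitsPerDim: number): number {
  const float32Bytes = dim * 4; // 32 bits per dimension
  return float32Bytes / payloadBytes(dim, bitsPerDim);
}

const exampleDim = 1024; // hypothetical embedding size
console.log(payloadBytes(exampleDim, 4.5));              // 576
console.log(exampleDim * 4);                             // 4096
console.log(ratioVsFloat32(exampleDim, 4.5).toFixed(2)); // "7.11"
```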

Maintenance & Community

The README does not list maintainers, community channels (e.g., Discord, Slack), or a project roadmap.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The MIT license permits commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

This is an experimental build. Its compression (~4.5 bpd) is less aggressive than methods like PQ/OPQ (1-2 bpd), and per-pair dot queries are slower than PQ/OPQ because of per-pair decompression, though dotBatch offers acceleration. It requires a modern browser or Node.js runtime with relaxed SIMD support.
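To make the per-pair versus batched distinction concrete, here is a generic brute-force top-k search by dot product. It uses uncompressed Float32 data as a stand-in and does not call turboquant-wasm; `dot`, `scoreBatch`, and `topK` are hypothetical names for this sketch. The point is only the shape of the workload: one query scored against many vectors stored back to back, which is what a batched call can amortize.

```typescript
// Plain Float32 dot product; a stand-in for a dot product computed on compressed data.
function dot(a: Float32Array, b: Float32Array, offset = 0): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[offset + i];
  return sum;
}

// Score one query against `count` vectors stored contiguously in `flat`.
function scoreBatch(query: Float32Array, flat: Float32Array, count: number): Float32Array {
  const scores = new Float32Array(count);
  for (let v = 0; v < count; v++) scores[v] = dot(query, flat, v * query.length);
  return scores;
}

// Indices of the k highest scores, best first.
function topK(scores: Float32Array, k: number): number[] {
  return Array.from({ length: scores.length }, (_, i) => i)
    .sort((i, j) => scores[j] - scores[i])
    .slice(0, k);
}

const query = new Float32Array([1, 0]);
const flat = new Float32Array([0, 1, 1, 0, 0.5, 0.5]); // three 2-d vectors
console.log(topK(scoreBatch(query, flat, 3), 2)); // [ 1, 2 ]
```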

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 8
  • Issues (30d): 1
  • Star History: 307 stars in the last 29 days
