turboquant-wasm by teamchong

Vector compression and fast search for web and edge

Created 4 weeks ago

308 stars

Top 87.1% on SourcePulse

1 Expert Loves This Project

Georgios Konstantopoulos

CTO, General Partner at Paradigm

Project Summary

This repository provides an experimental WebAssembly (WASM) build of TurboQuant that compresses vectors and runs dot product searches directly on the compressed data in browsers and Node.js. It targets developers who need to shrink the memory footprint of embedding vectors for applications such as real-time search, image similarity, or LLM KV cache compression, and it offers a roughly 6x size reduction without a training step.

How It Works

The project leverages the TurboQuant algorithm, employing polar decomposition and QJL rotation, implemented in Zig and compiled to WASM with relaxed SIMD instructions for CPU execution. It offers a dual-substrate approach: the core turboquant-wasm npm package utilizes WASM for general vector operations, while an optional WebGPU compute shader path accelerates dotBatch operations by processing compressed vectors directly on the GPU. This design provides a single, dependency-light package with transparent fallback mechanisms.
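
The fallback behaviour described above can be pictured as a small capability check. The sketch below is illustrative only: `Backend`, `Capabilities`, `chooseBackend`, and `detectWebGpu` are hypothetical names, not part of the turboquant-wasm API, and the rule that relaxed SIMD is always required is an assumption drawn from the statement that the core package runs on WASM.

```typescript
// Hypothetical names throughout; this is not the turboquant-wasm API.
type Backend = "webgpu" | "wasm-relaxed-simd";

interface Capabilities {
  hasWebGpu: boolean;      // a WebGPU entry point is present in this runtime
  hasRelaxedSimd: boolean; // the runtime meets the relaxed SIMD requirement
}

// Prefer the GPU path for batch scoring, otherwise fall back to WASM SIMD.
// Assumption: the WASM core is needed in both cases, so a missing
// relaxed SIMD capability is treated as a hard error.
function chooseBackend(caps: Capabilities): Backend {
  if (!caps.hasRelaxedSimd) {
    throw new Error("relaxed SIMD support is required for the WASM core");
  }
  return caps.hasWebGpu ? "webgpu" : "wasm-relaxed-simd";
}

// Cheap presence check only. A real check would also await
// navigator.gpu.requestAdapter() and handle a null adapter.
function detectWebGpu(): boolean {
  const nav = (globalThis as any).navigator;
  return typeof nav?.gpu?.requestAdapter === "function";
}

const backend = chooseBackend({
  hasWebGpu: detectWebGpu(),
  hasRelaxedSimd: true, // assumed here; detect this separately in practice
});
console.log(`batch scoring would use: ${backend}`);
```

Keeping the decision in one pure function makes the fallback easy to test without a GPU, which is one plausible reason a package would hide it behind a single method.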

Quick Start & Requirements

Install: npm install turboquant-wasm
Prerequisites:
- Runtime: Chrome 114+, Firefox 128+, Safari 18+, Node.js 20+ (for relaxed SIMD).
- WebGPU: Chrome 113+, Edge 113+ (for GPU acceleration).
- Build: Zig 0.15.2 and Bun are required for building from source.
Documentation: Refer to the TypeScript API and the original TurboQuant research paper ("TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate", ICLR 2026).
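
A preflight check for the Node.js requirement might look like the sketch below. `meetsNodeMinimum` is a hypothetical helper, not something the package is known to ship; the default of 20 simply mirrors the prerequisite listed above.

```typescript
// Hypothetical helper; the minimum of 20 mirrors the prerequisite above.
function meetsNodeMinimum(version: string, minMajor = 20): boolean {
  // Accept both "v20.11.1" and "20.11.1".
  const majorText = version.replace(/^v/, "").split(".")[0] ?? "";
  const major = Number.parseInt(majorText, 10);
  return Number.isFinite(major) && major >= minMajor;
}

// In Node.js the running version is exposed as process.versions.node.
// It is read through globalThis so the snippet also type-checks in
// projects without Node type definitions.
const current: string | undefined = (globalThis as any).process?.versions?.node;
if (current !== undefined && !meetsNodeMinimum(current)) {
  console.warn(`Node.js ${current} is older than the listed minimum of 20`);
}
```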

Highlighted Details

- Achieves approximately 4.5 bits per dimension (bpd), yielding a ~6x compression ratio compared to Float32.
- Enables direct dot product calculations on compressed vectors, avoiding costly decompression.
- Features a "no-training" approach, allowing immediate encoding of any vector.
- The dotBatch method transparently utilizes WebGPU compute shaders when available, falling back to WASM SIMD.
- Verified byte-identical output with the reference Zig implementation via golden-value tests.
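
To make the bits-per-dimension figure concrete, the sketch below estimates storage for a single vector. The function names and the 768-dimension example are illustrative assumptions. Only payload bits are counted: at 4.5 bpd that gives 32 / 4.5 ≈ 7.1x, somewhat above the ~6x quoted above. The gap would be consistent with some per-vector overhead, but that is an assumption, since the summary does not break the figure down.

```typescript
// Bytes needed to store one uncompressed Float32 vector.
function float32Bytes(dim: number): number {
  return dim * 4;
}

// Payload bytes for one compressed vector at a given bit rate.
// Any per-vector metadata is ignored (assumption).
function compressedBytes(dim: number, bitsPerDim = 4.5): number {
  return Math.ceil((dim * bitsPerDim) / 8);
}

// Payload-only compression ratio relative to Float32.
function compressionRatio(dim: number, bitsPerDim = 4.5): number {
  return float32Bytes(dim) / compressedBytes(dim, bitsPerDim);
}

const dim = 768; // illustrative embedding size
console.log(float32Bytes(dim));                 // 3072
console.log(compressedBytes(dim));              // 432
console.log(compressionRatio(dim).toFixed(2));  // "7.11"
```

Passing a different `bitsPerDim` makes it easy to compare other bit rates on the same footing.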

Maintenance & Community

The README gives no details on maintainers, community channels (e.g., Discord, Slack), or a project roadmap.

Licensing & Compatibility

License: MIT.
Compatibility: The MIT license permits commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

This is an experimental build. At ~4.5 bpd, its compression is less aggressive than PQ/OPQ (1-2 bpd). Individual dot queries are also slower than PQ/OPQ because of per-pair decompression, although dotBatch speeds up bulk scoring. The package requires a modern browser or Node.js runtime with relaxed SIMD support.

Explore Similar Projects

awesome-vector-database by dangkhoasdc

turboquant-gpu by DevTechJr

turbovec by RyanCodrai

pyturboquant by jorgebmann

triattention by WeianMao

VectorChord by tensorchord

pgvectorscale by timescale

kvpress by NVIDIA

jvector by datastax

marqo by marqo-ai

pgvector by pgvector

faiss by facebookresearch