Wax by christopherkarani

On-device RAG and memory for Swift AI agents

Created 1 month ago
573 stars

Top 56.3% on SourcePulse

Project Summary

Wax provides an on-device memory layer for AI agents, simplifying the integration of complex RAG pipelines into Swift applications. It replaces multi-service architectures with a serverless, single-file solution, enabling private, portable, and deterministic memory management for iOS and macOS developers.

How It Works

Wax consolidates documents, embeddings, BM25 full-text search, HNSW vector indexes, and crash-recovery logs into a single .mv2s file. The file format is append-only and checksum-verified, and it uses a dual header for atomic updates, so data stays durable and portable without external dependencies or network calls. On Apple Silicon, vector search is accelerated with Metal on the GPU: the project reports 0.84ms per search with Metal versus 105ms on the CPU (M1 Pro, 10K docs).
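To illustrate the general idea behind a dual-header atomic update, here is a minimal sketch in Swift. It is not Wax's actual .mv2s layout: the `HeaderSlot` type, its fields, the FNV-1a stand-in checksum, and every function name are hypothetical. The sketch only shows the common pattern in which a writer overwrites the inactive header slot and a reader trusts the valid slot with the highest generation.

```swift
// Hypothetical sketch of a dual-header commit; none of these names come from Wax.
struct HeaderSlot {
    var generation: UInt64     // incremented on every commit
    var payloadOffset: UInt64  // end of the last committed append
    var checksum: UInt32       // covers generation and payloadOffset
}

// FNV-1a over the bytes of each value, used here only as a stand-in checksum.
func fnv1a(_ values: [UInt64]) -> UInt32 {
    var hash: UInt32 = 2166136261
    for value in values {
        for shift in stride(from: 0, to: 64, by: 8) {
            hash ^= UInt32((value >> UInt64(shift)) & 0xFF)
            hash = hash &* 16777619
        }
    }
    return hash
}

func makeSlot(generation: UInt64, payloadOffset: UInt64) -> HeaderSlot {
    HeaderSlot(generation: generation,
               payloadOffset: payloadOffset,
               checksum: fnv1a([generation, payloadOffset]))
}

func isValid(_ slot: HeaderSlot) -> Bool {
    slot.checksum == fnv1a([slot.generation, slot.payloadOffset])
}

// Reader: the active header is the valid slot with the highest generation.
func activeIndex(in slots: [HeaderSlot]) -> Int? {
    slots.indices
        .filter { isValid(slots[$0]) }
        .max { slots[$0].generation < slots[$1].generation }
}

// Writer: overwrite the slot that is NOT active, so a crash in the middle
// of the write leaves the previously committed header untouched.
func commit(_ slots: inout [HeaderSlot], payloadOffset: UInt64) {
    let active = activeIndex(in: slots)
    let nextGeneration = active.map { slots[$0].generation + 1 } ?? 1
    let target = (active == 0) ? 1 : 0
    slots[target] = makeSlot(generation: nextGeneration, payloadOffset: payloadOffset)
}

// Demo: commit once, then simulate a torn write of the newest header.
var slots = [makeSlot(generation: 1, payloadOffset: 4096),
             makeSlot(generation: 0, payloadOffset: 0)]
commit(&slots, payloadOffset: 8192)   // lands in slot 1 with generation 2
slots[1].checksum ^= 0xFFFF           // corrupt slot 1, as a crash might
let recovered = activeIndex(in: slots).map { slots[$0].payloadOffset }
```

After the simulated corruption, the reader falls back to slot 0 and its older payload offset, which is the property that makes the update atomic: a reader sees either the old committed state or the new one, never a half-written header.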

Quick Start & Requirements

  • Install: Add the package to Package.swift with .package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.6").
  • Requirements: Swift 6.2 and iOS 26 / macOS 26. Apple Silicon is required for the Metal GPU features.
  • Repository: https://github.com/christopherkarani/Wax.git
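For context, a manifest using that dependency might look like the sketch below. The package URL and version come from the listing above; the package and target name `MyAgentApp` is hypothetical, and the library product name `Wax` is an assumption that should be checked against the repository's own Package.swift.

```swift
// swift-tools-version: 6.2
import PackageDescription

let package = Package(
    name: "MyAgentApp", // hypothetical package name
    platforms: [
        .iOS(.v26),
        .macOS(.v26),
    ],
    dependencies: [
        // URL and minimum version as given in the listing above.
        .package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.6"),
    ],
    targets: [
        .target(
            name: "MyAgentApp",
            dependencies: [
                // Product name "Wax" is assumed, not confirmed by the listing.
                .product(name: "Wax", package: "Wax"),
            ]
        ),
    ]
)
```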

Highlighted Details

  • Performance: Vector search takes 0.84ms with Metal GPU acceleration and 105ms when CPU-bound (M1 Pro, 10K docs). Cold open to first query takes 17ms.
  • Features: Implements Query-Adaptive Hybrid Search, fusing BM25, vector, temporal, and structured results based on query type. Offers Tiered Memory Compression (full, gist, micro summaries) for efficient context retrieval. Features Deterministic Token Budgeting for reproducible RAG outputs.
  • Benchmarks: Reproducible XCTest benchmarks for ingest throughput (up to ~3236 docs/s) and search latency are available, with detailed methodology in the repo.
  • Storage Health: Includes a WAL/storage health track focusing on commit latency tails, file growth, and recovery behavior, with optimizations for compaction and replay.
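The listing does not say how Query-Adaptive Hybrid Search combines its result lists. One common way to fuse rankings from different retrievers is weighted reciprocal rank fusion (RRF), sketched below with only two lanes (BM25 and vector). The `QueryKind` enum, the weight values, and the function names are all hypothetical and are not taken from Wax.

```swift
// Hypothetical sketch of query-dependent fusion using weighted RRF.
// This is a stand-in technique, not Wax's documented algorithm.
enum QueryKind {
    case keyword   // e.g. exact identifiers or quoted phrases
    case semantic  // e.g. natural-language questions
}

// Illustrative weights only: favour BM25 for keyword queries,
// and the vector lane for semantic ones.
func laneWeights(for kind: QueryKind) -> (bm25: Double, vector: Double) {
    switch kind {
    case .keyword: return (1.0, 0.3)
    case .semantic: return (0.3, 1.0)
    }
}

// Weighted RRF: each lane adds weight / (k + rank) to a document's score,
// where rank starts at 1. Ties are broken by id so the output is deterministic.
func fuse(_ lanes: [(ranked: [String], weight: Double)], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    for lane in lanes {
        for (index, id) in lane.ranked.enumerated() {
            scores[id, default: 0] += lane.weight / (k + Double(index + 1))
        }
    }
    return scores.keys.sorted { a, b in
        let (sa, sb) = (scores[a]!, scores[b]!)
        return sa != sb ? sa > sb : a < b
    }
}

// The same two ranked lists, fused under different query types.
let bm25Hits = ["a", "b", "c"]
let vectorHits = ["c", "b"]

func search(_ kind: QueryKind) -> [String] {
    let w = laneWeights(for: kind)
    return fuse([(ranked: bm25Hits, weight: w.bm25),
                 (ranked: vectorHits, weight: w.vector)])
}

let keywordOrder = search(.keyword)
let semanticOrder = search(.semantic)
```

With these illustrative weights, the keyword query ranks `b` first while the semantic query ranks `c` first, even though both start from the same candidate lists. The explicit tie-break on document id keeps the ordering reproducible, which fits the emphasis on deterministic output above.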

Maintenance & Community

Developed by Christopher Karani. The README does not specify community channels (e.g., Discord, Slack) or sponsorship details.

Licensing & Compatibility

The README does not explicitly state the project's license. This requires clarification for commercial use or integration into closed-source projects.

Limitations & Caveats

Stress recall benchmarking is currently blocked by a crash in the test harness (signal 11, a segmentation fault). Advanced WAL maintenance features, such as proactive pressure commits and scheduled live-set rewrites, are guarded by default because their benefit depends on the workload, and they require explicit configuration to enable.

Health Check

  • Last Commit: 8 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 31
  • Issues (30d): 6
  • Star History: 575 stars in the last 30 days
