Content-addressable storage for LLMs and applications
Top 81.1% on SourcePulse
YAMS is a content-addressable storage system designed for LLMs and applications, offering deduplication, full-text search, and semantic search. It targets developers and researchers who need persistent, versioned, and easily searchable data storage with data integrity and efficient retrieval.
How It Works
YAMS uses SHA-256 hashing for content addressing, which makes stored data verifiable and immutable. Block-level deduplication is achieved via Rabin fingerprinting. It supports both full-text search using SQLite FTS5 and semantic search through vector embeddings. Crash recovery is handled by a write-ahead log, and the architecture is thread-safe, with reported throughput exceeding 100 MB/s.
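The combination of content addressing and block-level deduplication described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not YAMS's implementation or API: the names split_chunks and ContentStore are hypothetical, the store is in-memory only, and a simple polynomial rolling hash stands in for true Rabin fingerprinting. The window size, mask, and chunk size limits are arbitrary values chosen for the example.

```python
import hashlib

# Assumed parameters for illustration; YAMS's real values are not stated in the source.
WINDOW = 16                      # bytes covered by the rolling hash
BASE = 257
MOD = (1 << 31) - 1
POW = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window


def split_chunks(data, mask=0x3F, min_size=32, max_size=512):
    """Content-defined chunking with a polynomial rolling hash.

    A simplified stand-in for Rabin fingerprinting: a chunk ends when the
    low bits of the hash over the last WINDOW bytes all equal 1, or when
    the chunk reaches max_size.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        pos = i - start
        if pos >= WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * POW) % MOD
        h = (h * BASE + byte) % MOD
        size = pos + 1
        if (size >= min_size and (h & mask) == mask) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class ContentStore:
    """Hypothetical in-memory content-addressed store with chunk deduplication."""

    def __init__(self):
        self.chunks = {}     # SHA-256 hex digest of a chunk -> chunk bytes
        self.manifests = {}  # SHA-256 hex digest of the content -> chunk digests

    def put(self, data):
        """Store data and return its address (the SHA-256 of the full content)."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self.manifests:
            digests = []
            for chunk in split_chunks(data):
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)  # identical chunks stored once
                digests.append(digest)
            self.manifests[key] = digests
        return key

    def get(self, key):
        """Reassemble the content and verify it against its address."""
        data = b"".join(self.chunks[d] for d in self.manifests[key])
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("integrity check failed")
        return data


if __name__ == "__main__":
    store = ContentStore()
    doc = b"The quick brown fox jumps over the lazy dog. " * 40
    key = store.put(doc)
    assert store.get(key) == doc
    print(key, len(store.manifests[key]), "chunk references,",
          len(store.chunks), "unique chunks stored")
```

Because the address is derived from the content itself, storing the same bytes twice yields the same key and adds nothing to the store, and any corruption of a stored chunk is detected on read. Content-defined boundaries mean that a small edit tends to change only nearby chunks, so unchanged regions continue to deduplicate.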
Quick Start & Requirements
Run via Docker (docker run --rm -it ghcr.io/trvon/yams:latest --version) or build from source using Conan (recommended).
macOS prerequisites: brew install openssl@3 protobuf sqlite3 ncurses
Linux prerequisites: apt install libssl-dev libsqlite3-dev protobuf-compiler libncurses-dev
Initialize: yams init --non-interactive
Highlighted Details
Maintenance & Community
The project is actively maintained by trvon. Community channels are not explicitly mentioned in the README.
Licensing & Compatibility
Licensed under Apache-2.0, which permits commercial use and linking with closed-source projects.
Limitations & Caveats
Traditional CMake builds (without Conan) are noted to have dependency resolution issues; Conan builds are recommended. PDF extraction may fail if PDFium download is blocked by firewalls. Retrieval by name is listed as "coming soon."