Lossless compression library for AI pipelines
Top 95.9% on sourcepulse
ZipNN is a lossless compression library designed to reduce the storage footprint and improve the loading speed of AI models, particularly large language models. It targets AI researchers, developers, and users who work with large model files and need efficient storage and faster deployment. The library delivers strong compression ratios with high-speed decompression, especially for BF16 models, whose exponent bytes are highly compressible.
How It Works
ZipNN employs a data-aware compression strategy, automatically analyzing tensor data types (e.g., FP32, BF16, FP8) and applying optimized compression techniques. It leverages C implementations for core operations and supports various compression algorithms like ZSTD, LZ4, and Huffman, with an 'auto' mode selecting the best method. The library also includes specialized plugins for seamless integration with Hugging Face Transformers and vLLM, enabling compressed models to be loaded directly from the filesystem with on-the-fly CPU decompression.
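The data-aware strategy described above can be illustrated with a small stdlib-only sketch (zlib stands in for ZSTD; the byte-grouping transform shown here illustrates the general idea and is not ZipNN's actual code):

```python
import random
import struct
import zlib

random.seed(0)
# Simulate a small BF16 weight tensor: BF16 is the top two bytes of the
# big-endian IEEE-754 float32 encoding of each value.
values = [random.gauss(0.0, 0.02) for _ in range(4096)]
raw = b"".join(struct.pack(">f", v)[:2] for v in values)

# Data-aware transform: put every value's sign/exponent byte in one plane
# and every mantissa byte in another. Trained weights cluster in a narrow
# range, so the exponent plane has low entropy and compresses well.
grouped = raw[0::2] + raw[1::2]
compressed = zlib.compress(grouped, 9)
print(f"{len(raw)} bytes -> {len(compressed)} bytes")

# Lossless round trip: decompress, then re-interleave the two byte planes.
planes = zlib.decompress(compressed)
half = len(planes) // 2
restored = bytes(b for pair in zip(planes[:half], planes[half:]) for b in pair)
assert restored == raw
```

Separating the low-entropy exponent bytes from the near-random mantissa bytes is what lets a generic compressor find structure that an interleaved layout hides.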
Quick Start & Requirements
Install via pip:

pip install zipnn

Requirements: numpy, zstandard, torch
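A minimal usage sketch, based on the ZipNN class with compress/decompress shown in the project's README; treat the exact names and parameters as assumptions for your installed version:

```python
# Sketch of byte-level compression with ZipNN's Python API; class, method,
# and parameter names follow the project README and may differ by version.
try:
    from zipnn import ZipNN
except ImportError:  # zipnn not installed in this environment
    ZipNN = None

if ZipNN is not None:
    zpn = ZipNN(method="zstd")  # assumed parameter; 'auto' reportedly picks per dtype
    original = b"model weight bytes " * 1000
    compressed = zpn.compress(original)
    restored = zpn.decompress(compressed)
    assert restored == original
    print(f"{len(original)} -> {len(compressed)} bytes")
```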
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats