zipnn by zipnn

Lossless compression library for AI pipelines

created 1 year ago
270 stars

Project Summary

ZipNN is a lossless compression library that reduces the storage footprint of AI models, particularly large language models, and speeds up loading them. It targets AI researchers, developers, and users who work with large model files and need efficient storage and faster deployment. Its strongest results are on BF16 models, which typically shrink by about 33%, and decompression is fast enough to run on the CPU while a model is being loaded.

How It Works

ZipNN uses a data-aware compression strategy: it detects the tensor data type (e.g., FP32, BF16, FP8) and applies a compression technique tuned to that type. Core operations are implemented in C, and the library supports several compression methods, including ZSTD, LZ4, and Huffman, with an 'auto' mode that selects the best method for the data. Plugins for Hugging Face Transformers and vLLM let compressed models be loaded directly from the filesystem, with decompression performed on the CPU on the fly.
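To make "data-aware" concrete, here is a minimal sketch of one general technique for floating-point tensors: byte grouping. It is not ZipNN's implementation or API. The function names are hypothetical, the synthetic Gaussian "weights" are an assumption, and the standard-library zlib codec stands in for ZSTD, LZ4, or Huffman. In BF16, the high byte (sign and most of the exponent) takes only a few distinct values, while the low byte (mostly mantissa) is close to random, so storing the two byte streams separately usually helps a general-purpose compressor.

```python
import random
import struct
import zlib


def float_to_bf16_bytes(x: float) -> bytes:
    """Truncate a float32 to BF16 by keeping its upper 16 bits (big-endian)."""
    return struct.pack(">f", x)[:2]


def group_bytes(raw: bytes) -> bytes:
    """Reorder 2-byte values so all high bytes come first, then all low bytes."""
    return raw[0::2] + raw[1::2]


def ungroup_bytes(grouped: bytes) -> bytes:
    """Invert group_bytes: interleave the high-byte and low-byte halves again."""
    half = len(grouped) // 2
    out = bytearray(len(grouped))
    out[0::2] = grouped[:half]
    out[1::2] = grouped[half:]
    return bytes(out)


def make_fake_weights(count: int, seed: int = 0) -> bytes:
    """Synthetic stand-in for model weights: small Gaussian values as BF16."""
    rng = random.Random(seed)
    return b"".join(float_to_bf16_bytes(rng.gauss(0.0, 0.02)) for _ in range(count))


raw = make_fake_weights(20000)
plain = zlib.compress(raw, 9)                  # compress the bytes as stored
grouped = zlib.compress(group_bytes(raw), 9)   # compress after byte grouping

# The transform is lossless: decompress, then undo the grouping.
restored = ungroup_bytes(zlib.decompress(grouped))

print(f"original bytes:       {len(raw)}")
print(f"zlib, as stored:      {len(plain)}")
print(f"zlib, byte-grouped:   {len(grouped)}")
print(f"round trip identical: {restored == raw}")
```

The exact sizes depend on the data and codec; the point is only that knowing the data type lets the compressor see the low-entropy exponent bytes as one stream.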

Quick Start & Requirements

Highlighted Details

  • Achieves up to 80 GB/s decompression and 13 GB/s compression on multi-NUMA CPUs.
  • Supports FP8 (e4m3fn, e5m2) models.
  • Integrates with vLLM and Hugging Face via safetensors and HF transformers plugins.
  • Offers command-line scripts for batch compression/decompression.
  • BF16 models typically see a 33% size reduction.
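As a rough worked example of what these figures imply, the sketch below applies the 33% reduction and the 80 GB/s peak decompression rate to a hypothetical 8-billion-parameter BF16 model. The model size is an assumption chosen for illustration, and the code assumes the throughput is measured against the uncompressed output size, which the summary does not state. Real results depend on hardware and on the model.

```python
BYTES_PER_BF16_PARAM = 2  # BF16 stores each parameter in 16 bits


def model_sizes_gb(params_billions: float, reduction: float = 0.33):
    """Return (original_gb, compressed_gb) for a BF16 model.

    `reduction` is the typical BF16 size reduction quoted above.
    """
    original_gb = params_billions * BYTES_PER_BF16_PARAM
    compressed_gb = original_gb * (1.0 - reduction)
    return original_gb, compressed_gb


def decompress_seconds(output_gb: float, throughput_gb_s: float = 80.0) -> float:
    """Best-case decompression time at the quoted peak throughput."""
    return output_gb / throughput_gb_s


# Hypothetical 8B-parameter model (an assumption, not a measured result).
original, compressed = model_sizes_gb(8)
print(f"original:   {original:.2f} GB")
print(f"compressed: {compressed:.2f} GB")
print(f"saved:      {original - compressed:.2f} GB")
print(f"best-case decompression time: {decompress_seconds(original):.2f} s")
```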

Maintenance & Community

  • Latest release: v0.5.3 (adds FP8 support).
  • Active development with regular updates noted in the changelog.
  • Contact: zipnn_compression@gmail.com

Licensing & Compatibility

  • License: Not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • GPU implementations are noted as "on the way."
  • The license is not specified, which may impact commercial adoption.
  • Some integrations (like vLLM in containers) require building custom images or using pre-built ones.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 27 stars in the last 90 days
