tensorizer  by coreweave

Module for fast model serialization/deserialization

created 2 years ago
250 stars

Project Summary

This library provides a fast and efficient method for serializing and deserializing large machine learning models and their tensors. It targets ML engineers and researchers deploying models, enabling significantly reduced model load times from various storage backends like HTTP/S, Redis, and S3.

How It Works

Tensorizer serializes a model's weights into a single file optimized for fast loading. Keeping the weights in that file, rather than baking them into the container image, decouples model artifacts from images and reduces both image size and deployment latency. Deserialization is network-bound, so on a fast network loads can approach wire speed. Weights can also be streamed directly from S3 or HTTP/S endpoints without first being written to local disk.
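
To make the single-file, stream-friendly idea concrete, here is a minimal sketch using only the Python standard library. It is a toy container, not tensorizer's actual file format or API: the layout (a length-prefixed JSON header followed by raw payloads) and the names `write_tensors`, `read_exact`, and `stream_tensors` are assumptions made for illustration. The point it shows is that a reader needing only forward reads can consume the file from a network stream without a local copy.

```python
import io
import json
import struct


def write_tensors(stream, tensors):
    """Write named byte buffers as: 8-byte header length, JSON header, payloads."""
    header = [{"name": name, "nbytes": len(data)} for name, data in tensors.items()]
    header_bytes = json.dumps(header).encode("utf-8")
    stream.write(struct.pack("<Q", len(header_bytes)))
    stream.write(header_bytes)
    for data in tensors.values():
        stream.write(data)


def read_exact(stream, n):
    """Read exactly n bytes, tolerating the short reads a socket-like stream may return."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def stream_tensors(stream):
    """Yield (name, payload) pairs in file order using forward-only reads (no seek)."""
    (header_len,) = struct.unpack("<Q", read_exact(stream, 8))
    header = json.loads(read_exact(stream, header_len).decode("utf-8"))
    for entry in header:
        yield entry["name"], read_exact(stream, entry["nbytes"])


# Stand-in "weights": raw bytes keyed by parameter name (hypothetical names).
weights = {"layer0.weight": bytes(range(16)), "layer0.bias": b"\x00\x01\x02\x03"}

buf = io.BytesIO()          # stands in for a file, an HTTP response body, or an S3 object
write_tensors(buf, weights)
buf.seek(0)                 # rewind once so the reader starts at the beginning
loaded = dict(stream_tensors(buf))
```

Because `stream_tensors` never seeks, the same function would work on any object exposing `read`, which is the property that makes loading straight from a remote endpoint possible in principle.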

Quick Start & Requirements

  • Install via pip: python -m pip install tensorizer
  • The model serialization/deserialization examples additionally require the transformers and accelerate libraries.
  • S3 usage requires AWS credentials configuration (e.g., ~/.s3cfg) or direct credential passing.
  • Some tests require a GPU.

Highlighted Details

  • Achieves ~5 GB/s load speeds for a 20 GB GPT-J model on a 40GbE network.
  • Supports optional, fast tensor weight encryption/decryption using libsodium.
  • Offers concurrent read capabilities for improved performance on network-bound operations.
  • Provides direct state_dict compatibility for torch.nn.Module.load_state_dict.
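
As a quick sanity check on the throughput figure above, the arithmetic can be sketched as follows. It assumes decimal units (1 GB = 8 Gb) and ignores protocol overhead; the helper names are illustrative only.

```python
def gbit_to_gbyte_per_s(gbit_per_s):
    """Convert a link rate in gigabits per second to gigabytes per second."""
    return gbit_per_s / 8


def load_seconds(model_gb, throughput_gb_per_s):
    """Time to transfer a model of the given size at the given sustained throughput."""
    return model_gb / throughput_gb_per_s


line_rate = gbit_to_gbyte_per_s(40)   # theoretical 40GbE line rate: 5.0 GB/s
seconds = load_seconds(20, 5)         # 20 GB at ~5 GB/s: 4.0 seconds
```

In other words, ~5 GB/s is roughly the theoretical line rate of a 40GbE link, and at that rate a 20 GB model transfers in about 4 seconds.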

Maintenance & Community

  • Developed by CoreWeave.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • Redis support is preliminary and is not recommended for model deployment.
  • Quantized datatypes (e.g., qint8) are not currently supported, because deserializing them correctly requires supplemental quantization parameters.
  • plaid_mode is deprecated and has no effect.
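
The quantization caveat above can be illustrated with a small, library-free sketch of affine 8-bit quantization. This is not tensorizer or PyTorch code; `quantize`, `dequantize`, and the chosen `scale` and `zero_point` values are assumptions for illustration. It shows why the stored integers alone are not enough: the same bytes decode to different values under different parameters.

```python
def quantize(values, scale, zero_point):
    """Affine-quantize floats to the signed 8-bit range [-128, 127]."""
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        out.append(max(-128, min(127, q)))
    return out


def dequantize(qvalues, scale, zero_point):
    """Recover approximate floats; requires the same scale and zero_point used to quantize."""
    return [(q - zero_point) * scale for q in qvalues]


original = [-1.0, 0.0, 0.5, 1.0]
stored = quantize(original, scale=0.25, zero_point=0)

right = dequantize(stored, scale=0.25, zero_point=0)   # matches the original values
wrong = dequantize(stored, scale=0.5, zero_point=10)   # same integers, wrong parameters
```

A format that stores only the integer payload would have no way to tell these two decodings apart, so the quantization parameters have to travel with the tensor.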

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 2
  • Issues (30d): 0

Star History

  • 25 stars in the last 90 days
