tensorizer by CoreWeave

Module for fast model serialization/deserialization

Created 2 years ago
265 stars

Top 96.6% on SourcePulse

Project Summary

This library provides a fast and efficient method for serializing and deserializing large machine learning models and their tensors. It targets ML engineers and researchers deploying models, enabling significantly reduced model load times from various storage backends like HTTP/S, Redis, and S3.

How It Works

Tensorizer serializes model weights into a single, optimized file. This decouples model artifacts from container images, reducing image size and deployment latency. Because deserialization is network-bound rather than compute-bound, loading approaches wire speed on high-speed networks. The library also supports streaming loads directly from S3 or HTTP/S endpoints without requiring local disk storage.

Quick Start & Requirements

  • Install via pip: python -m pip install tensorizer
  • Requires transformers and accelerate libraries for model serialization/deserialization examples.
  • S3 usage requires AWS credentials configuration (e.g., ~/.s3cfg) or direct credential passing.
  • Some tests require a GPU.

Highlighted Details

  • Achieves ~5GB/s load speeds for a 20GB GPT-J model on a 40GbE network.
  • Supports optional, fast tensor weight encryption/decryption using libsodium.
  • Offers concurrent read capabilities for improved performance on network-bound operations.
  • Provides direct state_dict compatibility for torch.nn.Module.load_state_dict.

Maintenance & Community

  • Developed by CoreWeave.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • Redis support is preliminary and not recommended for model deployment.
  • Quantized datatypes (e.g., qint8) are not currently supported due to missing quantization parameters.
  • plaid_mode is deprecated and has no effect.

Health Check

  • Last Commit: 4 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days

Explore Similar Projects

Starred by Jeremy Howard (cofounder of fast.ai) and Stas Bekman (author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

  • Top 0.3% · 1k stars
  • Transformer library for flexible model development
  • Created 4 years ago · Updated 8 months ago

Starred by Nat Friedman (former CEO of GitHub), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 15 more.

FasterTransformer by NVIDIA

  • Top 0.1% · 6k stars
  • Optimized transformer library for inference
  • Created 4 years ago · Updated 1 year ago