megvii-research: Model compression and acceleration toolbox
Top 82.5% on SourcePulse
Sparsebit is a PyTorch-based toolkit for model compression and acceleration, offering pruning and quantization capabilities. It targets researchers and engineers seeking to reduce model size and inference latency with minimal code changes, supporting both Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
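Both PTQ and QAT rest on the same primitive: mapping float tensors to low-bit integers via a scale and zero-point, then mapping them back ("fake quantization"). The sketch below illustrates that round trip in plain Python; the function name and signature are illustrative, not Sparsebit's actual API.

```python
def quantize_dequantize(values, num_bits=8):
    """Simulate uniform affine quantization of a list of floats.

    Illustrative only: maps each value to an integer in [0, 2^num_bits - 1]
    using a min-max-derived scale and zero-point, then dequantizes.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    if hi == lo:  # degenerate range: nothing to quantize
        return list(values)
    scale = (hi - lo) / (qmax - qmin)          # step size between integer levels
    zero_point = round(qmin - lo / scale)       # integer offset aligning lo with qmin
    quantized = [min(qmax, max(qmin, round(v / scale) + zero_point))
                 for v in values]
    return [(q - zero_point) * scale for q in quantized]
```

The reconstruction error of each value is bounded by half a quantization step, which is why 8-bit PTQ often preserves accuracy with no retraining, while QAT simulates this rounding during training so the network can adapt to it.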
How It Works
Sparsebit leverages torch.fx to transform PyTorch models into a QuantModel, in which operations become QuantModules. This modular design makes it straightforward to extend quantization methods, observers, and modules. For pruning, it supports structured and unstructured pruning across weights, activations, and layers, using criteria such as L1/L0 norm, Fisher information, and HRank, and can export pruned models to ONNX.
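As one concrete instance of the pruning criteria above, L1-norm structured pruning ranks output channels by the L1 norm of their weights and keeps the strongest ones. The following is a minimal pure-Python sketch; the function name and data layout are illustrative assumptions, not Sparsebit's API.

```python
def l1_prune_channels(weight, keep_ratio=0.5):
    """Illustrative L1-norm structured pruning.

    weight: list of per-channel weight lists (one inner list per output channel).
    Returns the sorted indices of the channels to keep.
    """
    # L1 norm of each output channel's weights
    norms = [sum(abs(w) for w in channel) for channel in weight]
    n_keep = max(1, int(len(weight) * keep_ratio))
    # Rank channel indices by descending norm, keep the top n_keep,
    # then restore the original channel order.
    ranked = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In a real toolkit the surviving channels would then be gathered into a smaller layer (and the next layer's input channels adjusted to match), which is what makes structured pruning directly exportable to ONNX.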
Quick Start & Requirements
pip install sparsebit
Highlighted Details
Maintenance & Community
The project is maintained by megvii-research, with its most recent updates in April 2023. It credits several open-source projects that inspired it. Contact sunpeiqin@megvii.com for team opportunities.
Licensing & Compatibility
Released under the Apache 2.0 license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
While flexible, the README focuses heavily on specific model architectures and quantization techniques (e.g., GPTQ, QAT). Broader model compatibility and performance benchmarks beyond those listed may require user validation.