PyTorch SDK for structural pruning of deep neural networks
Torch-Pruning (TP) is a PyTorch-based framework designed for structural pruning of deep neural networks, enabling significant model compression. It targets researchers and engineers working with large models like LLMs, Vision Transformers, and Diffusion Models, offering a general-purpose toolkit to reduce model size and latency.
How It Works
TP employs a novel graph-based algorithm called DepGraph to automatically identify and manage parameter dependencies across layers. Unlike masking-based pruning, DepGraph groups coupled parameters for simultaneous removal, ensuring structural integrity. This approach simplifies the complex task of pruning interconnected layers in modern neural architectures.
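The snippet below is a minimal sketch of that low-level workflow, assuming the entry points shown in the upstream examples (tp.DependencyGraph, tp.prune_conv_out_channels); exact names can vary across Torch-Pruning versions.

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
example_inputs = torch.randn(1, 3, 224, 224)

# Build the dependency graph by tracing the model with example inputs.
DG = tp.DependencyGraph().build_dependency(model, example_inputs=example_inputs)

# Request a pruning group: removing output channels [2, 6, 9] of conv1 automatically
# collects all coupled parameters (the following BatchNorm, the next conv, ...).
group = DG.get_pruning_group(model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9])

# Physically remove the grouped channels at once, keeping the network structurally valid.
if DG.check_pruning_group(group):  # guards against pruning a layer down to zero channels
    group.prune()

print(model.conv1)  # out_channels reduced from 64 to 61
```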
Quick Start & Requirements
pip install torch-pruning --upgrade
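After installation, a typical quick start uses one of the high-level pruners instead of manipulating the dependency graph directly. The sketch below follows the upstream examples; the specific names (tp.pruner.MetaPruner, tp.importance.MagnitudeImportance, the pruning_ratio argument) are assumptions that may differ between releases.

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18(weights=None)
example_inputs = torch.randn(1, 3, 224, 224)

# Rank channels by an importance criterion (L2 magnitude here).
imp = tp.importance.MagnitudeImportance(p=2)

# Keep the final 1000-way classifier intact.
ignored_layers = [m for m in model.modules()
                  if isinstance(m, torch.nn.Linear) and m.out_features == 1000]

pruner = tp.pruner.MetaPruner(
    model,
    example_inputs,
    importance=imp,
    pruning_ratio=0.5,          # remove roughly 50% of channels
    ignored_layers=ignored_layers,
)

base_macs, base_params = tp.utils.count_ops_and_params(model, example_inputs)
pruner.step()                   # structurally removes channels; no masks are kept
macs, params = tp.utils.count_ops_and_params(model, example_inputs)
print(f"Params: {base_params/1e6:.2f}M -> {params/1e6:.2f}M")
```

The pruned model is a regular PyTorch module and is normally fine-tuned afterwards to recover accuracy.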
Highlighted Details
Maintenance & Community
The project is actively maintained, with recent updates including support for DeepSeek-R1-Distill and ongoing work on LLM examples. It is associated with the xML Lab at the National University of Singapore. Community discussions are available via WeChat groups.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README does not specify a license, which may hinder commercial adoption. If pruning affects static attributes or custom forward functions, users must update them manually. Saving and loading pruned models requires serializing the entire model object rather than only the state_dict, as sketched below.
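A hedged illustration of the save/load caveat: because pruning changes layer shapes, a pruned state_dict will not load into a freshly constructed, unpruned model, so the whole module object is serialized instead. The stand-in model below is hypothetical.

```python
import torch
import torch.nn as nn

# Assume `model` is a network whose layers were structurally pruned,
# so its shapes no longer match the original class definition.
model = nn.Sequential(nn.Conv2d(3, 61, 3), nn.ReLU())  # stand-in for a pruned model

# Save the full object: loading a plain state_dict into the unpruned
# architecture would fail with shape mismatches.
torch.save(model, "pruned_model.pth")

# Later / elsewhere: load structure and weights together.
# (On PyTorch >= 2.6, weights_only=False is needed to unpickle full modules.)
model = torch.load("pruned_model.pth", weights_only=False)
model.eval()
```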