Torch-Pruning by VainF

PyTorch SDK for structural pruning of deep neural networks

created 5 years ago
3,102 stars

Top 15.8% on sourcepulse

View on GitHub
Project Summary

Torch-Pruning (TP) is a PyTorch-based framework designed for structural pruning of deep neural networks, enabling significant model compression. It targets researchers and engineers working with large models like LLMs, Vision Transformers, and Diffusion Models, offering a general-purpose toolkit to reduce model size and latency.

How It Works

TP employs a novel graph-based algorithm called DepGraph to automatically identify and manage parameter dependencies across layers. Unlike masking-based pruning, DepGraph groups coupled parameters for simultaneous removal, ensuring structural integrity. This approach simplifies the complex task of pruning interconnected layers in modern neural architectures.
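
A minimal sketch of the low-level dependency-graph workflow, following the usage pattern in the project's README; the API names (DependencyGraph, get_pruning_group, prune_conv_out_channels) reflect that documentation and may differ between releases:

```python
import torch
from torchvision.models import resnet18
import torch_pruning as tp

model = resnet18(weights=None).eval()
example_inputs = torch.randn(1, 3, 224, 224)

# 1. Trace the model with example inputs to build the dependency graph.
DG = tp.DependencyGraph().build_dependency(model, example_inputs=example_inputs)

# 2. Collect the group of coupled parameters affected by removing output
#    channels 2, 6, and 9 of model.conv1 (e.g. the following BatchNorm and
#    any layer that consumes those channels).
group = DG.get_pruning_group(model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9])

# 3. Remove every coupled parameter in the group at once.
if DG.check_pruning_group(group):  # guard against pruning a layer down to zero channels
    group.prune()
```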

Quick Start & Requirements

  • Install via pip: pip install torch-pruning --upgrade
  • Requires PyTorch (both 1.x and 2.x are supported) and NumPy.
  • AutoGrad must be enabled during dependency graph building.
  • See Tutorials for detailed usage.
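
For most use cases the high-level pruners cover the whole workflow. The sketch below assumes the MetaPruner and GroupNormImportance names from recent releases; check the tutorials for the exact classes in your installed version:

```python
import torch
from torchvision.models import resnet18
import torch_pruning as tp

model = resnet18(weights=None)
example_inputs = torch.randn(1, 3, 224, 224)

# Rank channels by group-level L2 magnitude.
importance = tp.importance.GroupNormImportance(p=2)

pruner = tp.pruner.MetaPruner(
    model,
    example_inputs,
    importance=importance,
    pruning_ratio=0.5,          # remove roughly half of the prunable channels
    ignored_layers=[model.fc],  # keep the classifier's output dimension intact
)

pruner.step()  # structurally prunes the model in place
```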

Highlighted Details

  • Supports pruning a wide array of models from Huggingface, Timm, and Torchvision, including LLMs, SAM, and various CNNs/Transformers.
  • Offers high-level pruners for effortless pruning with options for global pruning, isomorphic pruning, and custom pruning ratios per layer or block.
  • Includes utilities for calculating MACs and parameters before and after pruning (see the sketch after this list), and supports optional sparse training.
  • Demonstrates significant latency reduction on ResNet-50, achieving ~5.7x speedup at 95% pruning ratio without fine-tuning.
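
As a rough sketch of the MACs/parameter accounting referenced above, tp.utils.count_ops_and_params (the counter used in the README examples; the name is assumed from recent releases) can wrap any pruning run:

```python
import torch
from torchvision.models import resnet18
import torch_pruning as tp

model = resnet18(weights=None)
example_inputs = torch.randn(1, 3, 224, 224)

# Cost before pruning.
base_macs, base_params = tp.utils.count_ops_and_params(model, example_inputs)

# Prune half of the prunable channels with the same high-level pruner
# shown in the Quick Start sketch.
tp.pruner.MetaPruner(
    model,
    example_inputs,
    importance=tp.importance.GroupNormImportance(p=2),
    pruning_ratio=0.5,
    ignored_layers=[model.fc],
).step()

# Cost after pruning.
macs, params = tp.utils.count_ops_and_params(model, example_inputs)
print(f"MACs:   {base_macs / 1e9:.2f}G -> {macs / 1e9:.2f}G")
print(f"Params: {base_params / 1e6:.2f}M -> {params / 1e6:.2f}M")
```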

Maintenance & Community

The project is actively maintained, with recent updates including support for DeepSeek-R1-Distill and ongoing work on LLM examples. It is associated with the xML Lab at the National University of Singapore. Community discussions are available via WeChat groups.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify a license, which may impact commercial adoption. Users must manually update any static attributes or forward functions that depend on pruned dimensions. Saving and loading pruned models requires serializing the entire model object rather than just the state_dict, as sketched below.
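
A hedged sketch of the save/load pattern this implies (the model below stands in for one already pruned with Torch-Pruning; the point is that pruned layer shapes no longer match the stock class definition):

```python
import torch
from torchvision.models import resnet18

# Stand-in for a model that has already been structurally pruned; its layer
# shapes would no longer match a freshly constructed resnet18(), so loading a
# bare state_dict into a new instance would fail with shape mismatches.
model = resnet18(weights=None)

model.zero_grad()  # drop gradient buffers so they are not serialized
torch.save(model, "pruned_model.pth")  # save the full object, not model.state_dict()

# Loading restores both the pruned architecture and its weights.
# On PyTorch >= 2.6, pass weights_only=False to allow full-object deserialization.
model = torch.load("pruned_model.pth")
```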

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 6

Star History

125 stars in the last 90 days

Explore Similar Projects

Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

wanda by locuslab

  • 782 stars
  • LLM pruning research paper implementation
  • created 2 years ago
  • updated 11 months ago