DNN training/compression framework (research paper)
This repository provides the deprecated PyTorch implementation of Only Train Once (OTO), an automatic, architecture-agnostic framework that trains and compresses DNNs in a single pass via structured pruning and erasing operators. It targets researchers and practitioners who want competitive accuracy and a reduced model size from one training run, without post-training fine-tuning.
How It Works
OTO employs a two-stage process. First, it constructs a pruning dependency graph that partitions the DNN's trainable variables into "Pruning Zero-Invariant Groups" (PZIGs), the minimal structural units that must be pruned together. Second, a hybrid structured sparse optimizer (e.g., HESSO) solves a structured sparsity problem to identify redundant PZIGs and drive them to zero during training. Because the groups are zero-invariant, erasing them yields a slimmer model whose outputs are identical to those of the full, sparsified model, which eliminates the need for post-training fine-tuning.
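To make the zero-invariance property concrete, here is a minimal sketch in plain PyTorch (not the OTO API) on a Conv-BN-Conv block: once every variable in a channel's group is zeroed, physically erasing that channel leaves the block's outputs unchanged, which is why no fine-tuning is required. The layer sizes and channel index below are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# Illustration of a zero-invariant group (not the OTO API): zeroing one
# channel's group, then removing it, leaves the block's output unchanged.
torch.manual_seed(0)

conv1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
bn    = nn.BatchNorm2d(8)
conv2 = nn.Conv2d(8, 4, kernel_size=3, padding=1)
full  = nn.Sequential(conv1, bn, nn.ReLU(), conv2).eval()

x = torch.rand(1, 3, 16, 16)
k = 2  # channel whose group we pretend the sparse optimizer has zeroed

# Zero every variable in channel k's group: conv1 filter and bias, BN affine
# parameters, and the conv2 weights that consume channel k.
with torch.no_grad():
    conv1.weight[k].zero_(); conv1.bias[k].zero_()
    bn.weight[k].zero_();    bn.bias[k].zero_()
    conv2.weight[:, k].zero_()
y_full = full(x)

# Physically erase channel k everywhere it appears and rebuild a slimmer block.
keep = [i for i in range(8) if i != k]
conv1_s = nn.Conv2d(3, 7, kernel_size=3, padding=1)
bn_s    = nn.BatchNorm2d(7)
conv2_s = nn.Conv2d(7, 4, kernel_size=3, padding=1)
with torch.no_grad():
    conv1_s.weight.copy_(conv1.weight[keep]); conv1_s.bias.copy_(conv1.bias[keep])
    bn_s.weight.copy_(bn.weight[keep]);       bn_s.bias.copy_(bn.bias[keep])
    bn_s.running_mean.copy_(bn.running_mean[keep])
    bn_s.running_var.copy_(bn.running_var[keep])
    conv2_s.weight.copy_(conv2.weight[:, keep]); conv2_s.bias.copy_(conv2.bias)
slim = nn.Sequential(conv1_s, bn_s, nn.ReLU(), conv2_s).eval()

# The slim block reproduces the full block's output, so no fine-tuning is needed.
print(torch.allclose(y_full, slim(x), atol=1e-6))  # True
```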
Quick Start & Requirements
pip install only_train_once
or clone the repository.
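After installation, the end-to-end workflow follows the two stages described above. The sketch below mirrors the repository's documented usage; argument names such as variant, target_group_sparsity, and out_dir are taken from its examples and may differ across versions, so treat them as illustrative rather than a fixed API.

```python
# Sketch of the documented OTO workflow; exact signatures may differ by version.
import torch
import torchvision.models as models
from only_train_once import OTO

model = models.resnet18(num_classes=10)
dummy_input = torch.rand(1, 3, 224, 224)

# Stage 1: trace the model to build the pruning dependency graph and
# partition trainable variables into PZIGs.
oto = OTO(model=model, dummy_input=dummy_input)

# Stage 2: obtain the hybrid structured sparse optimizer (HESSO) over those
# groups; parameter names here are assumptions based on the README examples.
optimizer = oto.hesso(variant='sgd', lr=0.1, target_group_sparsity=0.7)

# ... run an ordinary training loop with `optimizer` ...

# Erase the redundant PZIGs and export the compact sub-network.
oto.construct_subnet(out_dir='./pruned_model')
```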
Maintenance & Community
The project has migrated to microsoft/only_train_once. This repository is deprecated but maintained for historical purposes. Key publications include OTOv1 (NeurIPS 2021), OTOv2 (ICLR 2023), OTOv3 (preprint), and LoRAShear (LLM pruning, preprint).
Licensing & Compatibility
The repository does not explicitly state a license. However, the migration to microsoft/only_train_once suggests a potential shift to a Microsoft-approved license. Users should verify licensing for commercial or closed-source integration.
Limitations & Caveats
This is a deprecated PyTorch implementation; the official, actively maintained version lives at microsoft/only_train_once. Some items, such as the HESSO optimizer's technical report and the official erasing mode, are still pending release or under review.