Code for research paper on LLM quantization via learned rotations
SpinQuant addresses the challenge of reducing the computational and memory footprint of Large Language Models (LLMs) through advanced quantization techniques. It is designed for researchers and engineers working on LLM deployment and optimization, offering a method to achieve significant compression with minimal accuracy loss.
How It Works
SpinQuant introduces learned rotations, specifically utilizing Cayley transforms, to mitigate the impact of outliers in LLM weights and activations during quantization. This approach differs from static or random rotation methods by learning optimal rotation matrices, thereby improving quantization performance and reducing the accuracy gap compared to full-precision models.
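Below is a minimal PyTorch sketch of a Cayley-parametrized rotation; the helper name `cayley` is hypothetical and this is not the repository's implementation, only an illustration of how a skew-symmetric parameter yields an orthogonal matrix that can be optimized by gradient descent.

```python
import torch

def cayley(params: torch.Tensor) -> torch.Tensor:
    """Map an unconstrained square matrix to an orthogonal one via the
    Cayley transform R = (I + A)^{-1} (I - A), with A skew-symmetric."""
    A = params - params.T                     # enforce A^T = -A
    I = torch.eye(A.shape[0], dtype=A.dtype, device=A.device)
    return torch.linalg.solve(I + A, I - A)   # orthogonal: R^T R = I

# Because R is orthogonal, inserting R R^T = I around a linear layer keeps
# the output unchanged: W x = (W R)(R^T x). The rotated weights W R and
# activations R^T x can exhibit fewer outliers, so they quantize better.
R = cayley(torch.randn(8, 8, requires_grad=True))
print(torch.allclose(R.T @ R, torch.eye(8), atol=1e-4))  # True
```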
Quick Start & Requirements
Requires PyTorch with CUDA support and the `fast-hadamard-transform` package.
```bash
git clone https://github.com/facebookresearch/SpinQuant.git
cd SpinQuant
# Install PyTorch with CUDA from https://pytorch.org/get-started/locally/
pip install -r requirements.txt

# Install fast-hadamard-transform
git clone https://github.com/Dao-AILab/fast-hadamard-transform.git
cd fast-hadamard-transform
pip install .
```
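After installation, a quick smoke test can confirm the CUDA kernel works. This is a sketch assuming the package's `hadamard_transform(x, scale)` entry point, a CUDA device, and a last dimension that is a power of two.

```python
import torch
from fast_hadamard_transform import hadamard_transform

# Half-precision activations on GPU; the transform acts on the last dim.
x = torch.randn(4, 512, device="cuda", dtype=torch.float16)
# scale = d^{-1/2} makes the transform orthonormal (norm-preserving).
y = hadamard_transform(x, scale=512 ** -0.5)
print(y.shape)  # torch.Size([4, 512])
```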
Scripts are provided for optimizing rotations (`10_optimize_rotation.sh`, `11_optimize_rotation_fsdp.sh`) and for evaluating quantized models (`2_eval_ptq.sh`); a hypothetical invocation is sketched below. Export to ExecuTorch is also supported (`31_optimize_rotation_executorch.sh`, `32_eval_ptq_executorch.sh`). Running the scripts requires a Hugging Face access token (`access_token`) and potentially large datasets for evaluation.
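The exact arguments vary by script; the following is a hypothetical invocation (the model id, the bit-width order, and the `HF_TOKEN` variable are assumptions; check each script's header for the real interface).

```bash
# Hypothetical arguments: model id, then weight/activation/KV-cache bits.
export HF_TOKEN=<access_token>   # Hugging Face token for gated models
bash 10_optimize_rotation.sh meta-llama/Llama-2-7b 4 4 16
bash 2_eval_ptq.sh meta-llama/Llama-2-7b 4 4 16
```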
Maintenance & Community
The project is from Meta AI (facebookresearch) and is associated with the paper "SpinQuant: LLM Quantization with Learned Rotations." Contact information for Zechun Liu and Changsheng Zhao is provided.
Licensing & Compatibility
The project is licensed under CC-BY-NC 4.0, which restricts commercial use.
Limitations & Caveats
The CC-BY-NC 4.0 license prohibits commercial use. The README notes that results reported in the paper were run with an internal Meta codebase, and the released code is a reproduction using HuggingFace, which may lead to minor discrepancies.