SVD-LLM: Compressing LLMs with Singular Value Decomposition (AIoT-MLSys-Lab)
Summary
SVD-LLM addresses the challenge of compressing large language models (LLMs) by employing Singular Value Decomposition (SVD). It targets researchers and practitioners seeking to reduce model size and computational cost while maintaining performance, offering a novel truncation-aware approach.
How It Works
The core methodology applies truncation-aware SVD to LLM weight matrices. This is complemented by data whitening, which conditions the weights on calibration activations before truncation, and by sequential low-rank adaptation (LoRA) fine-tuning to update the compressed parameters. The framework also supports integration with quantization techniques such as GPTQ for higher compression ratios.
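The whitening-then-truncation idea can be illustrated with a minimal NumPy sketch. This is not the repository's implementation; the function name, the Cholesky-based whitening, and the regularization constant are illustrative assumptions. The key point is that the SVD is taken of the whitened weight, so truncation minimizes error on calibration inputs rather than on the raw weight matrix.

```python
import numpy as np

def whitened_svd_compress(W, X, rank):
    """Hypothetical sketch: compress weight W (out x in) to `rank`
    using calibration activations X (in x n_samples)."""
    # Whitening factor S from the activation Gram matrix (Cholesky),
    # lightly regularized for numerical stability (illustrative choice).
    gram = X @ X.T + 1e-6 * np.eye(X.shape[0])
    S = np.linalg.cholesky(gram)
    # SVD of the whitened weight, truncated to the target rank.
    U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)
    U_k = U[:, :rank] * sigma[:rank]            # absorb singular values
    # Undo the whitening on the right factor: V_k = Vt_k @ S^{-1}.
    V_k = np.linalg.solve(S.T, Vt[:rank].T).T
    return U_k, V_k                             # W is approximated by U_k @ V_k

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))   # toy weight matrix
X = rng.standard_normal((128, 256))  # toy calibration activations
U_k, V_k = whitened_svd_compress(W, X, rank=32)
```

Storing the two factors `U_k` and `V_k` in place of `W` is what yields the parameter savings; the subsequent LoRA fine-tuning step would then update these compressed factors.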
Quick Start & Requirements
Requires the transformers package, version 4.35.2. Set up a conda environment: conda create -n compress python=3.9, then conda activate compress. Clone the repository and install dependencies: pip install -r requirements.txt. Running bash compress_llama.sh compresses LLaMA-7B and runs evaluations. Step-by-step instructions cover SVD compression (python SVDLLM.py --step 1), LoRA fine-tuning (python LoRA.py), and GPTQ integration (bash svdllm_gptq.sh). Evaluation scripts are also provided.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The pinned transformers version (4.35.2) may pose integration challenges with newer releases. The repository was last updated 2 months ago and is marked inactive.