SVD-LLM by AIoT-MLSys-Lab

Compressing LLMs with Singular Value Decomposition

Created 1 year ago
259 stars

Top 98.0% on SourcePulse

Project Summary

SVD-LLM addresses the challenge of compressing large language models (LLMs) by employing Singular Value Decomposition (SVD). It targets researchers and practitioners seeking to reduce model size and computational cost while maintaining performance, offering a novel truncation-aware approach.

How It Works

The core methodology applies truncation-aware SVD to LLM weight matrices. A data-whitening step, built from calibration inputs, maps each singular value directly to compression loss, so truncating the smallest singular values minimizes accuracy degradation; sequential Low-Rank Adaptation (LoRA) fine-tuning then updates the compressed parameters. The framework also supports integration with quantization techniques such as GPTQ for higher compression ratios.
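
To make the whitening-then-truncation step concrete, below is a minimal NumPy sketch applied to a single weight matrix. It is illustrative only, assuming a Cholesky-based whitening matrix built from calibration activations; the function name, damping constant, and shapes are hypothetical, and the repo's actual implementation lives in SVDLLM.py.

    import numpy as np

    def whiten_and_truncate(W, X, rank):
        """Compress W (out_dim x in_dim) using calibration activations X (in_dim x n_samples)."""
        # Whitening matrix S: Cholesky factor of the input Gram matrix, so each
        # singular value of W @ S corresponds directly to compression loss.
        gram = X @ X.T + 1e-6 * np.eye(X.shape[0])  # small damping term (an assumption)
        S = np.linalg.cholesky(gram)
        # Truncation-aware step: drop the smallest singular values of the
        # whitened weight, not of the raw weight.
        U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)
        U_k, s_k, Vt_k = U[:, :rank], sigma[:rank], Vt[:rank, :]
        # Split sqrt(sigma) across both factors and undo the whitening on the
        # right, so W is approximated by the product of two low-rank matrices.
        A = U_k * np.sqrt(s_k)                                  # (out_dim, rank)
        B = (np.sqrt(s_k)[:, None] * Vt_k) @ np.linalg.inv(S)  # (rank, in_dim)
        return A, B

    # Toy usage: rank 409 on a 1024x1024 matrix keeps ~80% of the parameters,
    # roughly matching the quick example's 20% compression target.
    W = np.random.randn(1024, 1024)
    X = np.random.randn(1024, 2048)
    A, B = whiten_and_truncate(W, X, rank=409)
    print(A.shape, B.shape)  # (1024, 409) (409, 1024)

In the full pipeline, the truncated factors are then refined with LoRA fine-tuning, and GPTQ quantization can be layered on top for additional size reduction.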

Quick Start & Requirements

  • Installation: Requires Python 3.9 (newer versions are incompatible) and transformers version 4.35.2. Set up a conda environment (conda create -n compress python=3.9, then conda activate compress), clone the repository, and install dependencies with pip install -r requirements.txt.
  • Execution: The quick example bash compress_llama.sh compresses LLaMA-7B at a 20% compression ratio and runs evaluations (a back-of-the-envelope rank calculation for this ratio follows this list). Step-by-step instructions cover SVD compression (python SVDLLM.py --step 1), LoRA fine-tuning (python LoRA.py), and GPTQ integration (bash svdllm_gptq.sh). Evaluation scripts are also provided.
  • Resources: Links to the ICLR 2025 and NAACL 2025 papers are available. The C4 dataset is used for evaluation.
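
As referenced above, a quick way to see what a given compression ratio implies for a single layer is the following parameter-count arithmetic. This assumes the ratio denotes the fraction of parameters removed; rank_for_ratio is a hypothetical helper, not part of the repo.

    def rank_for_ratio(out_dim, in_dim, ratio):
        # Largest rank k such that the factors A (out_dim x k) and B (k x in_dim)
        # together hold at most (1 - ratio) of the original out_dim * in_dim weights.
        return int((1 - ratio) * out_dim * in_dim / (out_dim + in_dim))

    # A 4096x4096 projection (LLaMA-7B hidden size) at the quick example's 20% target:
    print(rank_for_ratio(4096, 4096, 0.2))  # -> 1638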

Highlighted Details

  • Presents work accepted at ICLR 2025 and NAACL 2025.
  • Supports compression ratios up to 0.3 (30%), with a quick example targeting 20%.
  • Evaluates compressed models on perplexity and efficiency metrics.
  • Offers integration with GPTQ for further model size reduction.

Maintenance & Community

  • No specific details on maintainers, community channels (e.g., Discord/Slack), or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not specify a software license. This omission requires clarification for adoption decisions, especially concerning commercial use or derivative works.

Limitations & Caveats

  • Strict dependency on Python 3.9 and transformers version 4.35.2 may pose integration challenges.
  • The absence of a declared license is a significant blocker for determining compatibility and usage rights.
  • The README does not detail hardware requirements beyond what might be inferred for LLM operations.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 6 stars in the last 30 days

Explore Similar Projects

Starred by Yaowei Zheng (author of LLaMA-Factory), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 7 more.

llm-awq by mit-han-lab

Top 0.2% on SourcePulse
3k stars
Weight quantization research paper for LLM compression/acceleration
Created 2 years ago
Updated 3 months ago