arcee-ai: LLM layer pruning for computational efficiency
Top 100.0% on SourcePulse
Automated identification and pruning of redundant layers in Large Language Models (LLMs) to significantly reduce computational costs during fine-tuning and inference. This project targets engineers and researchers working with LLMs, offering a practical method to achieve substantial resource savings with minimal performance degradation.
How It Works
The core approach analyzes layer similarity within an LLM using a specified dataset. Blocks of layers exhibiting high redundancy, typically found in the deeper sections of the model, are identified and then removed with MergeKit. After pruning, Parameter-Efficient Fine-Tuning (PEFT) techniques such as QLoRA are used to "heal" the model, recovering most of the performance lost to pruning. The strategy builds on empirical findings that deeper LLM layers often contribute less uniquely than previously assumed.
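Below is a minimal, hypothetical sketch of the similarity measurement, assuming a Hugging Face transformers causal LM; it probes a single prompt rather than averaging over a dataset as layer_similarity.py does, so treat it as an illustration of the idea rather than the repository's implementation.

# Sketch: score how little a block of n consecutive layers changes the hidden
# state; a nearly unchanged state suggests the block is redundant.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

n = 8  # size of the candidate block to prune (cf. --layers_to_skip)
inputs = tokenizer("Sample text used to probe layer redundancy.",
                   return_tensors="pt").to(model.device)

with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states

# hidden[i] is the input to layer i, so hidden[start] vs. hidden[start + n]
# compares the state entering and leaving the block of layers start..start+n-1.
def block_similarity(start):
    a = hidden[start][:, -1, :].float()
    b = hidden[start + n][:, -1, :].float()
    return torch.nn.functional.cosine_similarity(a, b, dim=-1).mean().item()

scores = {i: block_similarity(i) for i in range(len(hidden) - n)}
best = max(scores, key=scores.get)
print(f"Most redundant block: layers {best}..{best + n - 1} "
      f"(cosine similarity {scores[best]:.4f})")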
Quick Start & Requirements
The primary workflow begins with computing layer similarity using a script within the compute_block_similarity directory. An example command is:
python layer_similarity.py --model_path "mistralai/Mistral-7B-Instruct-v0.2" \
--dataset "arcee-ai/sec-data-mini" \
--dataset_column "text" \
--batch_size 8 \
--max_length 1024 \
--layers_to_skip 8 \
--dataset_size 4000 \
--dataset_subset "train"
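Once the most redundant block is known, MergeKit removes it with a passthrough merge. The following slicing configuration and invocation are a sketch, assuming the similarity step flagged layers 20-27 of the 32-layer model; substitute the block actually reported by layer_similarity.py (layer_range bounds are end-exclusive).

# prune_config.yaml
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 20]    # keep layers 0-19
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [28, 32]   # keep layers 28-31
merge_method: passthrough
dtype: bfloat16

Then run MergeKit on the config:

mergekit-yaml prune_config.yaml ./pruned-model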
Prerequisites include a pre-trained LLM (e.g., Mistral-7B), a suitable dataset, and the MergeKit and PEFT libraries for pruning and healing, respectively. An example pruned model is published as arcee-ai/Mistral-7B-Instruct-v0.2-sliced-24-layer on the Hugging Face Hub.
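For the healing step, a minimal QLoRA setup with the peft and bitsandbytes libraries could look like the sketch below; the LoRA rank, target modules, and other hyperparameters are illustrative assumptions rather than values taken from the repository.

# QLoRA "healing" of the pruned checkpoint (illustrative hyperparameters).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "./pruned-model",                 # output directory from the MergeKit step
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Fine-tune with your preferred trainer (e.g., transformers Trainer or TRL's
# SFTTrainer) on in-domain data to recover the quality lost to pruning.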
Maintenance & Community
This repository is marked as an "[unofficial]" implementation. No specific community channels, roadmap, or notable contributor information are provided in the README.
Licensing & Compatibility
The README does not explicitly state the project's license. This omission represents a significant caveat for potential adopters, especially concerning commercial use or integration into closed-source projects.
Limitations & Caveats
This is an "unofficial" implementation. The specific license is not stated in the README, posing a potential adoption blocker. Setup requires multiple distinct steps: similarity computation, pruning via MergeKit, and optional PEFT healing. The effectiveness of pruning may vary based on the model architecture and the chosen dataset.