LLM pruning tool for reducing model size and accelerating training
LLMPruner is a tool designed to reduce the size and memory footprint of large language models (LLMs) by pruning their vocabulary. It targets researchers and practitioners working with multilingual LLMs who need to fine-tune models for specific language tasks, offering a way to improve training efficiency and reduce hardware requirements.
How It Works
LLMPruner addresses the significant memory overhead caused by large vocabularies in multilingual LLMs. It achieves this by identifying and removing infrequently used tokens, retaining only those essential for specific language tasks (e.g., Chinese and English). This targeted pruning reduces model parameters and memory usage without sacrificing the pre-trained knowledge, enabling efficient fine-tuning on less demanding hardware.
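The approach described above amounts to slicing the embedding matrix down to the retained token ids and remapping the ids. A minimal sketch of that idea on a toy NumPy embedding, not LLMPruner's actual API (the function name, shapes, and helper here are illustrative assumptions):

```python
import numpy as np

def prune_embedding(embedding: np.ndarray, kept_ids: list[int]):
    """Keep only the rows for retained token ids and remap old ids to new ones.

    Hypothetical helper for illustration -- not LLMPruner's actual API.
    """
    kept_ids = sorted(set(kept_ids))
    id_map = {old: new for new, old in enumerate(kept_ids)}
    pruned = embedding[kept_ids]  # pre-trained rows survive unchanged
    return pruned, id_map

# Toy numbers: a 1,000-token vocabulary with hidden size 16, keeping 200
# tokens. (Bloom's real vocabulary is 250,880 tokens, so pruning it to a
# bilingual subset removes the bulk of the embedding parameters.)
rng = np.random.default_rng(0)
emb = rng.standard_normal((1_000, 16)).astype(np.float32)
pruned, id_map = prune_embedding(emb, list(range(0, 400, 2)))

print(pruned.shape)   # (200, 16)
print(id_map[4])      # old id 4 -> new id 2
```

Because the retained rows keep their pre-trained weights, the pruned model can be fine-tuned directly; the tokenizer and any tied output head need the same id remapping.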
Quick Start & Requirements
Requires the transformers library.

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README only documents Bloom models and does not state whether other LLM architectures are supported. Licensing terms are not provided, which may be a concern for commercial adoption.