AdaLoRA by QingruZhang

PyTorch package for parameter-efficient fine-tuning via adaptive budget allocation

created 2 years ago
337 stars

Top 82.8% on sourcepulse

Project Summary

AdaLoRA provides a parameter-efficient fine-tuning method that adaptively allocates a budget of trainable parameters across layers. It targets researchers and practitioners seeking to reduce the computational cost and memory footprint of fine-tuning large language models, enabling efficient adaptation of models like DeBERTa and BART.

How It Works

AdaLoRA parameterizes each weight update in the form of a Singular Value Decomposition (SVD): two trainable factor matrices and a diagonal of learnable singular values. Its core innovation is the RankAllocator, which dynamically adjusts the rank (the number of retained singular values) of each layer based on importance scores computed during training. This adaptive budget allocation, combined with an orthogonality regularizer that keeps the factors close to genuine singular vectors, aims to maximize performance per trainable parameter.
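As a minimal illustration (a hand-written sketch of the idea, not the library's actual SVDLinear code), the adapted layer keeps the pretrained weight frozen and adds a trainable delta P · diag(E) · Q, where zeroing entries of E lowers that layer's effective rank:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVDAdaptedLinear(nn.Module):
    """Sketch of an SVD-parameterized adapter in the spirit of AdaLoRA.

    The pretrained weight stays frozen; the trainable update is
    P @ diag(E) @ Q. Pruning entries of E shrinks this layer's effective
    rank, which is how the budget is reallocated across layers.
    """

    def __init__(self, in_features: int, out_features: int, r: int = 12):
        super().__init__()
        # Frozen pretrained weight (in practice loaded from a checkpoint).
        self.weight = nn.Parameter(torch.zeros(out_features, in_features),
                                   requires_grad=False)
        self.lora_P = nn.Parameter(torch.zeros(out_features, r))        # left factor
        self.lora_E = nn.Parameter(torch.zeros(r))                      # "singular values"
        self.lora_Q = nn.Parameter(torch.randn(r, in_features) * 0.01)  # right factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.lora_P @ torch.diag(self.lora_E) @ self.lora_Q
        return F.linear(x, self.weight + delta)

    def orth_regu(self) -> torch.Tensor:
        # Orthogonality penalty ||P^T P - I||^2 + ||Q Q^T - I||^2 keeps the
        # factors close to genuine singular vectors.
        eye = torch.eye(self.lora_E.numel(), device=self.lora_E.device)
        return ((self.lora_P.T @ self.lora_P - eye).pow(2).sum()
                + (self.lora_Q @ self.lora_Q.T - eye).pow(2).sum())
```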

Quick Start & Requirements

  • Install via pip install -e loralib/.
  • Requires PyTorch.
  • Example usage replaces standard nn.Linear layers with loralib.SVDLinear and drives loralib.RankAllocator from within the training loop (see the sketch after this list).
  • Detailed examples for GLUE and NLG tasks are provided in the NLU/ and NLG_QA/ directories.
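A condensed usage sketch, adapted from the upstream README: SVDLinear and RankAllocator are the documented entry points, while the keyword arguments shown and the helpers mark_only_lora_as_trainable and compute_orth_regu follow the README and may differ across versions; model, optimizer, dataloader, in_features, and out_features are assumed to be defined elsewhere.

```python
import loralib

# Replace a dense layer with its SVD-adapted counterpart (initial rank r=12).
# Before: layer = nn.Linear(in_features, out_features)
layer = loralib.SVDLinear(in_features, out_features, r=12)

# Train only the adaptation parameters; pretrained weights stay frozen.
loralib.mark_only_lora_as_trainable(model)

# The allocator prunes singular values during training until the average
# rank per layer meets the target budget.
rankallocator = loralib.RankAllocator(
    model, lora_r=12, target_rank=8,
    init_warmup=500, final_warmup=1500, mask_interval=10,
    total_step=3000, beta1=0.85, beta2=0.85,
)

for global_step, batch in enumerate(dataloader):
    loss = model(**batch).loss  # assumes a Hugging Face-style model output
    # Add the orthogonality penalty, backprop, then let the allocator
    # update importance scores and rank masks for this step.
    (loss + loralib.compute_orth_regu(model, regu_weight=0.1)).backward()
    optimizer.step()
    optimizer.zero_grad()
    rankallocator.update_and_mask(model, global_step)
```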

Highlighted Details

  • The core implementation lives in loralib/adalora.py.
  • Integrated into 🤗 PEFT (Parameter-Efficient Fine-Tuning) library.
  • Demonstrates performance on GLUE benchmark with DeBERTaV3-base.
  • Includes examples for summarization and question-answering tasks with BART-large and DeBERTaV3-base.

Maintenance & Community

  • The project's core implementation is merged into the HuggingFace PEFT library.
  • Issues can be raised in either repository.

Licensing & Compatibility

  • The repository itself does not explicitly state a license. However, its integration into 🤗 PEFT suggests compatibility with the Apache 2.0 license of PEFT.

Limitations & Caveats

  • The original repository's license is not specified, which may require clarification for commercial use.
  • Requires careful integration into existing training loops and model architectures.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 20 stars in the last 90 days

Explore Similar Projects

HALOs by ContextualAI

  • Library for aligning LLMs using human-aware loss functions
  • Top 0.2% on sourcepulse, 873 stars
  • created 1 year ago, updated 2 weeks ago
  • Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake)

LoRA by microsoft

  • PyTorch library for low-rank adaptation (LoRA) of LLMs
  • Top 0.3% on sourcepulse, 12k stars
  • created 4 years ago, updated 7 months ago
  • Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 6 more