PyTorch library for low-rank adaptation (LoRA) of LLMs
This repository provides loralib, a Python package implementing Low-Rank Adaptation (LoRA) for large language models. It enables efficient fine-tuning by injecting trainable low-rank matrices into pre-trained models, significantly reducing the number of trainable parameters and storage requirements without introducing inference latency. The target audience includes researchers and engineers who need to adapt large NLP models to specific tasks efficiently.
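For a rough sense of the savings: full fine-tuning of one weight matrix trains d x k parameters, while LoRA trains only r x (d + k). The numbers below are illustrative, not taken from the repository.

```python
# Back-of-the-envelope count for one 4096 x 4096 projection (illustrative values).
d, k, r = 4096, 4096, 8      # hypothetical layer shape and LoRA rank
full_ft = d * k              # 16,777,216 params trained by full fine-tuning
lora_ft = r * (d + k)        # 65,536 params trained by LoRA (A: r x k, B: d x r)
print(full_ft // lora_ft)    # -> 256, i.e. 256x fewer trainable parameters
```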
How It Works
LoRA reduces trainable parameters by decomposing weight updates into two smaller, low-rank matrices. This approach drastically cuts the memory footprint of fine-tuning and allows rapid task switching, since each task only needs its own small set of LoRA weights. The library integrates seamlessly with PyTorch models, offering drop-in replacements for nn.Linear, nn.Embedding, and nn.Conv2d layers.
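To make the decomposition concrete, here is a minimal from-scratch sketch of a LoRA-style linear layer. This is not loralib's actual implementation; the class name, initialization choices, and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinearSketch(nn.Module):
    """Frozen weight W plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.weight.requires_grad = False           # pre-trained weight stays frozen
        self.lora_A = nn.Parameter(torch.zeros(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.normal_(self.lora_A, std=0.02)      # small random init for A
        self.scaling = alpha / r                    # B starts at zero, so the update is zero initially

    def forward(self, x):
        frozen = x @ self.weight.T                        # original path
        update = (x @ self.lora_A.T) @ self.lora_B.T      # low-rank path
        return frozen + self.scaling * update
```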
Quick Start & Requirements
- Install: pip install loralib or pip install git+https://github.com/microsoft/LoRA
- Drop-in LoRA replacements are provided for nn.Linear, nn.Embedding, and nn.Conv2d; a minimal workflow is sketched below.
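A condensed sketch of the typical workflow, assuming loralib's documented API (lora.Linear, lora.mark_only_lora_as_trainable, lora.lora_state_dict); the layer sizes and checkpoint path are placeholders.

```python
import torch
import torch.nn as nn
import loralib as lora

# Swap supported layers for their LoRA counterparts; r is the adaptation rank.
model = nn.Sequential(
    lora.Linear(768, 768, r=16),
    nn.ReLU(),
    lora.Linear(768, 10, r=16),
)

# Freeze everything except the LoRA matrices before training.
lora.mark_only_lora_as_trainable(model)

# ... train as usual ...

# Checkpoint only the LoRA parameters (kilobytes to megabytes, not the full model).
torch.save(lora.lora_state_dict(model), 'ckpt_lora.pt')

# To restore: load the pre-trained weights first, then the LoRA weights non-strictly.
model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)
```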
Highlighted Details
- lora.MergedLinear handles fused layers (e.g., QKV projections) and supports optional bias training; see the sketch below.
- LoRA weights can be merged back into the frozen weights on model.eval(), eliminating inference latency.
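A sketch of both features together, assuming loralib's documented MergedLinear signature; d_model and the enable_lora pattern are illustrative.

```python
import torch
import loralib as lora

d_model = 768  # hypothetical model width

# One fused projection producing Q, K, and V in a single matrix;
# enable_lora adapts only the Q and V slices, leaving K frozen.
qkv = lora.MergedLinear(d_model, 3 * d_model, r=8,
                        enable_lora=[True, False, True])

# With merge_weights=True (the default), switching to eval mode folds the
# low-rank update B @ A into the frozen weight, so inference runs a single
# dense matmul with no added latency.
qkv.eval()
y = qkv(torch.randn(2, d_model))
```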
Maintenance & Community
Latest activity was 7 months ago; the repository is inactive.

Licensing & Compatibility
Released under the MIT license.
Limitations & Caveats
- The code used for the paper's experiments is preserved on the snapshot-9-15-2021 branch.
- Models with fused projection matrices (e.g., a single QKV weight) require lora.MergedLinear or manually splitting them before LoRA can be applied.