PEFT pretraining code for ReLoRA research paper
This repository provides the official implementation for ReLoRA, a technique designed to enable high-rank training of large language models through low-rank updates. It is intended for researchers and practitioners looking to improve model training efficiency and performance by effectively integrating low-rank adaptations.
How It Works
ReLoRA integrates existing LoRA parameters back into the main network weights and then resets them. This approach aims to be more flexible than standard LoRA by allowing more frequent updates and a potentially higher effective rank. Key parameters include the reset frequency (--relora), optimizer reset behavior (--reset_optimizer_on_relora, --optimizer_magnitude_pruning), and a cyclical learning rate scheduler (cosine_restarts) with --cycle_length.
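The core operation can be pictured as a periodic merge-and-reset of the adapter matrices. The sketch below is a minimal illustration of that idea, not the repository's actual code: the class and method names (LoRALinear, merge_and_reinit), the rank/alpha defaults, and the initialization choices are all assumptions.

```python
import math
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Illustrative LoRA layer: effective weight = W + scaling * B @ A."""

    def __init__(self, in_features, out_features, r=8, alpha=32):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x):
        w_eff = self.weight + self.scaling * self.lora_B @ self.lora_A
        return x @ w_eff.T

    @torch.no_grad()
    def merge_and_reinit(self):
        # ReLoRA-style restart: fold the current low-rank update into the
        # base weight, then reinitialize the adapters for the next cycle.
        self.weight += self.scaling * self.lora_B @ self.lora_A
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))  # fresh subspace
        nn.init.zeros_(self.lora_B)  # merged function is unchanged at reset
```

Because each cycle trains a fresh low-rank update on top of the already-merged weights, the accumulated change to the base weights can exceed the rank of any single adapter, which is the sense in which training becomes high-rank; the optimizer-reset flags and the cyclical scheduler control how state is cleared at each restart.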
Quick Start & Requirements
Install with pip install -e . and pip install flash-attn; dependencies are listed in requirements.txt.

Highlighted Details
- Distributed training via torchrun.
- cosine_restarts learning rate scheduler for cyclical training (sketched below).
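As a rough illustration of the cyclical schedule, the snippet below builds a cosine-with-restarts multiplier using a plain LambdaLR; the cycle_length value and min_ratio floor are assumptions, and the repository's cosine_restarts scheduler may differ in details such as per-cycle warmup.

```python
import math
import torch


def cosine_restarts_lambda(cycle_length, min_ratio=0.1):
    # LR multiplier that decays along a cosine within each cycle and
    # jumps back to 1.0 at every multiple of cycle_length.
    def lr_lambda(step):
        t = (step % cycle_length) / cycle_length
        return min_ratio + 0.5 * (1.0 - min_ratio) * (1.0 + math.cos(math.pi * t))
    return lr_lambda


model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=cosine_restarts_lambda(cycle_length=1000)
)
# Call scheduler.step() once per optimizer step; aligning cycle_length with
# the ReLoRA reset frequency ties each LR restart to an adapter merge.
```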
Maintenance & Community

Licensing & Compatibility
Limitations & Caveats
The project is presented as the official code for a research paper, and its current state of maintenance and long-term support is not detailed. The README mentions that main.py will be deleted and recommends torchrun for single-GPU training.