Improving transformer training with a single line of code
This repository introduces Cautious Optimizers (C-Optim), a novel modification to momentum-based optimizers that enhances training speed and stability in deep learning models. It targets researchers and engineers working on large-scale model pretraining and fine-tuning, offering a simple, one-line code change to improve performance.
How It Works
C-Optim applies a single-line modification to existing momentum-based PyTorch optimizers, such as AdamW and Lion, creating variants like C-AdamW and C-Lion. The change masks out the components of a proposed update whose sign disagrees with the current gradient, so only "agreeing" directions are applied. This modification is theoretically shown to preserve Adam's Hamiltonian function and its convergence guarantees under Lyapunov analysis, and it yields a new family of optimizers whose simplest variant already demonstrates significant speed-ups.
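A minimal sketch of the cautious masking step, assuming a generic PyTorch update tensor; the function name and the exact renormalization constant here are illustrative, not the repository's implementation:

```python
import torch

def cautious_update(update: torch.Tensor, grad: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Keep only the components of the proposed update whose sign agrees
    # with the current gradient (the "cautious" mask).
    mask = (update * grad > 0).to(grad.dtype)
    # Rescale surviving components so the average update magnitude is preserved.
    mask = mask * (mask.numel() / (mask.sum() + eps))
    return update * mask
```

Applied just before the parameter update in AdamW or Lion, this single extra masking step is the entire difference between the base optimizer and its cautious variant.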
Quick Start & Requirements
pip install -r requirements.txt
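A minimal usage sketch, assuming the repository exposes a cautious AdamW as a drop-in replacement for torch.optim.AdamW; the module and class names (c_adamw.AdamW) are assumptions, so check the repository for the actual import path:

```python
import torch
from torch import nn
from c_adamw import AdamW  # hypothetical import path; see the repo for the real one

# Toy model and data just to exercise the optimizer.
model = nn.Linear(128, 10)
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```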
Highlighted Details
pytorch-image-models
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is pre-release; the accompanying paper was published in late 2024, so the code is under active development and may change. The README does not mention specific limitations or unsupported platforms.