C-Optim by kyleliang919

Improving transformer training with a single line of code

created 1 year ago
344 stars

Top 81.6% on sourcepulse

View on GitHub
Project Summary

This repository introduces Cautious Optimizers (C-Optim), a novel modification to momentum-based optimizers that enhances training speed and stability in deep learning models. It targets researchers and engineers working on large-scale model pretraining and fine-tuning, offering a simple, one-line code change to improve performance.

How It Works

C-Optim applies a single-line modification to existing momentum-based PyTorch optimizers such as AdamW and Lion, producing variants like C-AdamW and C-Lion. The change masks out update components whose sign disagrees with the current gradient, so each step only moves parameters in directions the gradient currently agrees with. The paper shows this modification preserves Adam's Hamiltonian function and does not break its convergence guarantees under Lyapunov analysis. The result is a new family of cautious optimizers, with even the simplest variant delivering significant speed-ups.
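The precise one-liner lives in the repository's optimizer implementations; the sketch below only illustrates the masking idea. The helper name, the rescaling factor, and the AdamW step shown in the comments are assumptions for illustration, not the repository's code.

```python
import torch

def cautious(update: torch.Tensor, grad: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Keep only the update components whose sign agrees with the current
    # gradient, then rescale so the surviving components carry the full
    # update "mass". (The exact rescaling in the repository may differ.)
    mask = (update * grad > 0).to(update.dtype)
    mask = mask * (mask.numel() / (mask.sum() + eps))
    return update * mask

# Inside an AdamW-style step (exp_avg = first-moment buffer, denom = sqrt of
# the bias-corrected second moment plus eps), the usual parameter update
#     p.add_(exp_avg / denom, alpha=-lr)
# would become
#     p.add_(cautious(exp_avg, grad) / denom, alpha=-lr)
```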

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires PyTorch (a minimal usage sketch follows this list).
  • Examples provided for Llama, MAE, Qwen2.5, and PPO training.
  • Links: Paper, Hugging Face integration
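
A minimal usage sketch, assuming the repository exposes a cautious AdamW variant as a drop-in replacement for torch.optim.AdamW; the module and class names below are assumptions, so check the repository for the actual imports.

```python
import torch
import torch.nn as nn

# Assumption: the cautious AdamW variant is importable roughly like this;
# the actual module/class names may differ -- see the repository.
from c_adamw import AdamW as CautiousAdamW

model = nn.Linear(128, 10)
optimizer = CautiousAdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()       # cautious masking is applied inside step()
optimizer.zero_grad()
```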

Highlighted Details

  • Achieves up to 1.47x speed-up on Llama and MAE pretraining.
  • Integrated into Hugging Face's pytorch-image-models.
  • Supports PPO for reinforcement learning tasks.
  • Post-training experiments on Qwen2.5 models are available.

Maintenance & Community

  • Official implementation released November 2024.
  • Paper available on arXiv.
  • Active development with recent updates in January 2025.

Licensing & Compatibility

  • The repository does not explicitly state a license.

Limitations & Caveats

The project is an early-stage research release accompanying a paper published in late 2024, so APIs and reported results may still change. The README does not list specific limitations or unsupported platforms.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 122 stars in the last 90 days
