accelerate by huggingface

PyTorch training helper for distributed execution

Created 4 years ago
9,138 stars

Top 5.6% on SourcePulse

View on GitHub
Project Summary

Hugging Face Accelerate simplifies distributed PyTorch training and inference across diverse hardware configurations, including multi-CPU, multi-GPU, and TPUs. It targets PyTorch users who want to leverage distributed computing and mixed precision without extensive boilerplate code modifications, enabling faster and more scalable model development.

How It Works

Accelerate acts as a thin wrapper around PyTorch's distributed capabilities, abstracting away device placement and distributed communication logic. By initializing an Accelerator object and calling accelerator.prepare() on models, optimizers, and data loaders, users can seamlessly transition their existing PyTorch training scripts to run on various distributed setups and mixed precision formats (FP16, BF16, FP8) with minimal code changes. This approach preserves the user's control over the training loop while handling the complexities of distributed execution.
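
In practice the change is typically a few lines around an otherwise ordinary PyTorch loop. A minimal sketch, using a toy linear model and synthetic data in place of the user's real objects:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from accelerate import Accelerator

# Toy stand-ins for the user's own model, optimizer, and dataloader.
accelerator = Accelerator()  # e.g. Accelerator(mixed_precision="fp16") for AMP

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
dataloader = DataLoader(dataset, batch_size=8)

# prepare() wraps each object for the current device and distributed setup,
# so the loop below needs no explicit .to(device) calls.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```

The same script runs unchanged on a laptop CPU, a multi-GPU node, or a TPU pod slice; only the launch configuration differs.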

Quick Start & Requirements

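A typical flow, assuming the standard PyPI package and the documented CLI (train.py is a placeholder for the user's own script):

  • Install: pip install accelerate
  • Answer the interactive configuration questionnaire: accelerate config
  • Run a script on the configured setup: accelerate launch train.py
  • Requires Python and PyTorch; check the repository for minimum supported versions.
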
Highlighted Details

  • Supports single/multi-CPU, single/multi-GPU, and TPU configurations.
  • Integrates automatic mixed precision (FP16, BF16), with FP8 support via Transformer Engine or MS-AMP.
  • Experimental support for DeepSpeed, PyTorch Fully Sharded Data Parallel (FSDP), and Megatron-LM.
  • Provides an optional CLI (accelerate config, accelerate launch) for environment setup and script launching.
  • Offers notebook_launcher for distributed training from within notebooks (e.g., Colab, Kaggle); see the sketch below.
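
A minimal sketch of the notebook path; the training function body and the num_processes value are placeholders:

```python
from accelerate import notebook_launcher

def training_function():
    # Placeholder entry point: in practice this would build the model,
    # optimizer, and dataloaders and run the training loop shown above.
    print("hello from one process")

# Spawns the function across devices from inside a notebook cell;
# num_processes=2 is an illustrative value (e.g. a 2-GPU Kaggle kernel).
notebook_launcher(training_function, num_processes=2)
```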

Maintenance & Community

  • Developed by Hugging Face with contributions from numerous individuals.
  • Widely integrated into other popular libraries like transformers, fastai, and stable-diffusion-webui.
  • Community support channels are available via Hugging Face's platforms.

Licensing & Compatibility

  • Apache License 2.0.
  • Permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • DeepSpeed, FSDP, and Megatron-LM integrations are marked as experimental.
  • Requires users to write their own training loops; not a high-level framework.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 36
  • Issues (30d): 35

Star History

106 stars in the last 30 days

Explore Similar Projects

Starred by Peter Norvig (author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Alexey Milovidov (co-founder of ClickHouse), and 29 more.

llm.c by karpathy

LLM training in pure C/CUDA, no PyTorch needed

28k stars · Top 0.2% on SourcePulse
Created 1 year ago · Updated 2 months ago