finetrainers by huggingface

Library for diffusion model training

created 10 months ago
1,232 stars

Top 32.6% on sourcepulse

View on GitHub
Project Summary

This library provides scalable and memory-optimized training for diffusion models, targeting researchers and practitioners working with advanced AI video generation. It aims to make complex training algorithms more accessible and efficient.

How It Works

Finetrainers supports distributed training (DDP, FSDP-2, HSDP) and memory-efficient single-GPU training. It offers LoRA and full-rank finetuning, conditional control training, and multiple attention backends (flash, flex, sage, xformers). The library features flexible dataset handling, including combined image/video, chainable local/remote datasets, and multi-resolution bucketing, with memory-efficient precomputation options.
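
For intuition on what LoRA finetuning does here, the sketch below (generic PyTorch, not finetrainers' own implementation; class and parameter names are illustrative) wraps a frozen linear layer with a low-rank trainable update, which is why LoRA needs far less VRAM than full-rank finetuning:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: y = W x + scale * B(A(x))."""
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Only the small adapter matrices receive gradients; the base model stays frozen.
layer = LoRALinear(nn.Linear(1024, 1024), rank=16)
out = layer(torch.randn(2, 1024))
```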

Quick Start & Requirements

  • Install: pip install -r requirements.txt, then pip install git+https://github.com/huggingface/diffusers
  • Recommended: PyTorch 2.5.1 or above (a quick version-check sketch follows this list).
  • For reproducible training, use the environment specified in environment.md.
  • Stable releases: git fetch --all --tags && git checkout tags/v0.2.0.
  • Example training scripts and datasets are available in the repository.
  • Documentation: docs/models, examples/training, docs/args.
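
As a quick post-install sanity check (not from the README; a minimal sketch that assumes a CUDA-capable machine), you can confirm the PyTorch version and visible GPU memory before launching a training run:

```python
import torch

# Verify the recommended PyTorch version (2.5.1 or above) and GPU visibility.
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print("device:", torch.cuda.get_device_name(0))
    print(f"free VRAM: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")
```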

Highlighted Details

  • Supports LTX-Video, HunyuanVideo, CogVideoX, Wan, CogView4, and Flux models.
  • Offers LoRA and full-rank finetuning, with specific VRAM requirements detailed for various models and resolutions.
  • Integrates torch.compile and multiple attention providers for performance optimization.
  • Includes tooling for curating video datasets and supports naive FP8 weight-casting training for reduced VRAM usage (a conceptual sketch follows this list).
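
To illustrate the naive FP8 weight-casting idea (a generic sketch, not finetrainers' code; it assumes a recent PyTorch build that ships the float8_e4m3fn dtype), weights are stored in FP8 to roughly halve weight memory versus bf16 and upcast to the activation dtype at compute time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def cast_linear_weights_to_fp8(model: nn.Module) -> None:
    """Store every nn.Linear weight in float8_e4m3fn to reduce weight VRAM."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            fp8 = module.weight.to(torch.float8_e4m3fn)
            module.weight = nn.Parameter(fp8, requires_grad=False)

def fp8_linear_forward(layer: nn.Linear, x: torch.Tensor) -> torch.Tensor:
    """Upcast the FP8-stored weight to the activation dtype just before the matmul."""
    weight = layer.weight.to(x.dtype)
    return F.linear(x, weight, layer.bias)
```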

Maintenance & Community

The project is actively developed with frequent updates. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

The library builds upon and integrates with various open-source libraries. The README does not state a license for finetrainers itself; check the repository's LICENSE file, and note that its dependencies carry their own licenses.

Limitations & Caveats

The main development branch is noted as unstable. Some model support (e.g., Wan, CogView4, Flux) has TODO entries for VRAM requirements. HunyuanVideo full finetuning is listed as OOM (Out Of Memory) for the tested configuration.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 5

Star History
139 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 10 more.

open-r1 by huggingface

Top 0.2% · 25k stars
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 3 days ago