finetrainers by huggingface

Library for diffusion model training

created 10 months ago
1,232 stars

Top 32.6% on sourcepulse

View on GitHub
Project Summary

This library provides scalable and memory-optimized training for diffusion models, targeting researchers and practitioners working with advanced AI video generation. It aims to make complex training algorithms more accessible and efficient.

How It Works

Finetrainers supports distributed training (DDP, FSDP-2, HSDP) and memory-efficient single-GPU training. It offers LoRA and full-rank finetuning, conditional control training, and multiple attention backends (flash, flex, sage, xformers). The library features flexible dataset handling, including combined image/video, chainable local/remote datasets, and multi-resolution bucketing, with memory-efficient precomputation options.
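
For intuition on what LoRA finetuning does here, the sketch below (generic PyTorch, not finetrainers' own implementation; class and parameter names are illustrative) wraps a frozen linear layer with a low-rank trainable update, which is why LoRA needs far less VRAM than full-rank finetuning:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: y = W x + scale * B(A(x))."""
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Only the small adapter matrices receive gradients; the base model stays frozen.
layer = LoRALinear(nn.Linear(1024, 1024), rank=16)
out = layer(torch.randn(2, 1024))
```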

Quick Start & Requirements

  • Install: pip install -r requirements.txt, then pip install git+https://github.com/huggingface/diffusers
  • Recommended: PyTorch 2.5.1 or above (a quick version-check sketch follows this list).
  • For reproducible training, use the environment specified in environment.md.
  • Stable releases: git fetch --all --tags && git checkout tags/v0.2.0.
  • Example training scripts and datasets are available in the repository.
  • Documentation: docs/models, examples/training, docs/args.
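
As a quick post-install sanity check (not from the README; a minimal sketch that assumes a CUDA-capable machine), you can confirm the PyTorch version and visible GPU memory before launching a training run:

```python
import torch

# Verify the recommended PyTorch version (2.5.1 or above) and GPU visibility.
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print("device:", torch.cuda.get_device_name(0))
    print(f"free VRAM: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")
```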

Highlighted Details

  • Supports LTX-Video, HunyuanVideo, CogVideoX, Wan, CogView4, and Flux models.
  • Offers LoRA and full-rank finetuning, with specific VRAM requirements detailed for various models and resolutions.
  • Integrates torch.compile and multiple attention providers for performance optimization.
  • Includes tooling for curating video datasets and supports naive FP8 weight-casting training for reduced VRAM usage (a conceptual sketch follows this list).
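
To illustrate the naive FP8 weight-casting idea (a generic sketch, not finetrainers' code; it assumes a recent PyTorch build that ships the float8_e4m3fn dtype), weights are stored in FP8 to roughly halve weight memory versus bf16 and upcast to the activation dtype at compute time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def cast_linear_weights_to_fp8(model: nn.Module) -> None:
    """Store every nn.Linear weight in float8_e4m3fn to reduce weight VRAM."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            fp8 = module.weight.to(torch.float8_e4m3fn)
            module.weight = nn.Parameter(fp8, requires_grad=False)

def fp8_linear_forward(layer: nn.Linear, x: torch.Tensor) -> torch.Tensor:
    """Upcast the FP8-stored weight to the activation dtype just before the matmul."""
    weight = layer.weight.to(x.dtype)
    return F.linear(x, weight, layer.bias)
```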

Maintenance & Community

The project is actively developed with frequent updates. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

The library builds upon and integrates with various open-source libraries. The README does not state a license for finetrainers itself; check the repository's LICENSE file, and note that its dependencies carry their own licenses.

Limitations & Caveats

The main development branch is noted as unstable. Some model support (e.g., Wan, CogView4, Flux) has TODO entries for VRAM requirements. HunyuanVideo full finetuning is listed as OOM (Out Of Memory) for the tested configuration.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 5

Star History
139 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 10 more.

open-r1 by huggingface

Top 0.2% · 25k stars
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 3 days ago