finetrainers by huggingface

Library for diffusion model training

Created 11 months ago
1,277 stars

Top 31.1% on SourcePulse

Project Summary

This library provides scalable and memory-optimized training for diffusion models, targeting researchers and practitioners working with advanced AI video generation. It aims to make complex training algorithms more accessible and efficient.

How It Works

Finetrainers supports distributed training (DDP, FSDP-2, HSDP) and memory-efficient single-GPU training. It offers LoRA and full-rank finetuning, conditional control training, and multiple attention backends (flash, flex, sage, xformers). The library features flexible dataset handling, including combined image/video, chainable local/remote datasets, and multi-resolution bucketing, with memory-efficient precomputation options.
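The multi-resolution bucketing mentioned above can be illustrated with a small, library-agnostic sketch (the helper names below are hypothetical, not finetrainers' actual API): each sample is assigned to the supported resolution whose aspect ratio best matches its own, so batches can be formed from uniformly sized tensors.

```python
# Illustrative sketch of multi-resolution bucketing (not finetrainers' API):
# samples are grouped by the bucket resolution closest to their aspect ratio.

def assign_bucket(width, height, buckets):
    """Pick the bucket (w, h) whose aspect ratio best matches the sample."""
    aspect = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - aspect))

def group_into_buckets(samples, buckets):
    """Group (width, height) samples by their assigned bucket."""
    grouped = {}
    for w, h in samples:
        grouped.setdefault(assign_bucket(w, h, buckets), []).append((w, h))
    return grouped

buckets = [(512, 512), (768, 512), (512, 768)]
samples = [(640, 480), (480, 640), (1024, 1024)]
grouped = group_into_buckets(samples, buckets)
# The 4:3 sample lands in the wide 768x512 bucket, the 3:4 sample in the
# tall 512x768 bucket, and the square sample in 512x512.
```

Grouping this way avoids padding or cropping every sample to a single fixed resolution, which is why bucketing pairs well with the memory-efficient precomputation the library offers.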

Quick Start & Requirements

  • Install: pip install -r requirements.txt and pip install git+https://github.com/huggingface/diffusers.
  • Recommended: PyTorch 2.5.1 or above.
  • For reproducible training, use the environment specified in environment.md.
  • Stable releases: git fetch --all --tags && git checkout tags/v0.2.0.
  • Example training scripts and datasets are available in the repository.
  • Documentation: docs/models, examples/training, docs/args.
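Taken together, the steps above amount to the following setup recipe (the install commands and tag name come from the README bullets; the clone step is the standard way to obtain the repository):

```shell
# Clone the repository and pin to the stable release noted above
git clone https://github.com/huggingface/finetrainers.git
cd finetrainers
git fetch --all --tags && git checkout tags/v0.2.0

# Install dependencies, plus diffusers from source
pip install -r requirements.txt
pip install git+https://github.com/huggingface/diffusers
```

Running this inside a fresh virtual environment (with PyTorch 2.5.1 or above) matches the recommendations above; for bit-for-bit reproducibility, prefer the environment pinned in environment.md.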

Highlighted Details

  • Supports LTX-Video, HunyuanVideo, CogVideoX, Wan, CogView4, and Flux models.
  • Offers LoRA and full-rank finetuning, with specific VRAM requirements detailed for various models and resolutions.
  • Integrates torch.compile and multiple attention providers for performance optimization.
  • Includes tooling for curating video datasets and supports naive FP8 weight-casting training for reduced VRAM usage.

Maintenance & Community

The project is actively developed with frequent updates. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

The library builds upon and integrates with various open-source libraries. The README does not state a license for finetrainers itself; verify the license in the repository before use, and note that any use must also comply with the licenses of its dependencies.

Limitations & Caveats

The main development branch is noted as unstable. Some model support (e.g., Wan, CogView4, Flux) has TODO entries for VRAM requirements. HunyuanVideo full finetuning is listed as OOM (Out Of Memory) for the tested configuration.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 24 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Hanlin Tang (CTO Neural Networks at Databricks; Cofounder of MosaicML), and 1 more.

diffusion by mosaicml

Diffusion model training code
707 stars · Created 2 years ago · Updated 8 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 5 more.

ai-toolkit by ostris

Training toolkit for finetuning diffusion models
6k stars · Created 2 years ago · Updated 13 hours ago