ai-toolkit by ostris

Training toolkit for finetuning diffusion models

created 2 years ago
5,397 stars

Top 9.5% on sourcepulse

Project Summary

This toolkit fine-tunes diffusion models and targets users who want to train image and video models on consumer-grade hardware. It offers both a GUI and a CLI.

How It Works

The toolkit is built on PyTorch and supports training techniques such as LoRA and LoKr. It includes dataset preparation features, such as automatic resizing and aspect ratio handling, and gives fine-grained control over which model layers are trained through the only_if_contains and ignore_if_contains network arguments.
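
The layer-selection idea can be illustrated with a minimal sketch. It assumes that both arguments are lists of substrings matched against layer names, and that a layer is trained only when it passes both filters. The function select_layers and the layer names are hypothetical and are not taken from the toolkit's code, so the actual matching rules may differ.

```python
from typing import Iterable, List, Optional


def select_layers(
    layer_names: Iterable[str],
    only_if_contains: Optional[List[str]] = None,
    ignore_if_contains: Optional[List[str]] = None,
) -> List[str]:
    """Return the layer names that pass both substring filters (illustrative only)."""
    selected = []
    for name in layer_names:
        # When include patterns are given, keep the layer only if one of them matches.
        if only_if_contains and not any(p in name for p in only_if_contains):
            continue
        # Drop the layer if any exclude pattern matches.
        if ignore_if_contains and any(p in name for p in ignore_if_contains):
            continue
        selected.append(name)
    return selected


# Hypothetical layer names, for illustration only.
layers = [
    "transformer.single_transformer_blocks.7.proj_out",
    "transformer.single_transformer_blocks.20.proj_out",
    "transformer.transformer_blocks.3.attn.to_q",
]

only_block_7 = select_layers(layers, only_if_contains=["single_transformer_blocks.7."])
no_single_blocks = select_layers(layers, ignore_if_contains=["single_transformer_blocks"])
```

Under these assumptions, only_block_7 holds the first name and no_single_blocks holds the last one. Plain substring matching keeps the filters easy to write, at the cost of needing care with prefixes (the trailing dot in "blocks.7." avoids also matching block 70).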

Quick Start & Requirements

  • Installation: Clone the repository, create a Python virtual environment, install a CUDA 12.6 (cu126) build of PyTorch, and then install the requirements.
  • Prerequisites: Python >3.10, an NVIDIA GPU with sufficient VRAM, and Node.js >18 (for the UI).
  • UI: Run npm run build_and_start in the ui directory. Access at http://localhost:8675.
  • Auth Token: Set AI_TOOLKIT_AUTH environment variable to secure the UI.
  • Documentation: Tutorials and examples are available for FLUX.1 training, RunPod, and Modal.
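
The Python and Node.js prerequisites above could be checked before installing with a small script. This is a sketch, not part of the toolkit. It reads ">3.10" and ">18" as "3.10 or newer" and "major version 18 or newer", which is an assumption, and it does not check the GPU. The helper names are hypothetical.

```python
import re
import shutil
import subprocess
import sys
from typing import Dict, Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def node_version() -> Optional[Tuple[int, ...]]:
    """Return the installed Node.js version, or None if node is not on PATH."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run([node, "--version"], capture_output=True, text=True)
    return parse_version(result.stdout)


def check_prerequisites() -> Dict[str, bool]:
    """Report which prerequisites appear to be satisfied on this machine."""
    node = node_version()
    return {
        # Assumption: ">3.10" is read as Python 3.10 or newer.
        "python": sys.version_info[:2] >= (3, 10),
        # Assumption: ">18" is read as Node.js major version 18 or newer.
        "node": node is not None and node[0] >= 18,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Node.js is only needed for the UI, so a failed "node" check does not block CLI training.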

Highlighted Details

  • Supports training on consumer-grade hardware, with specific tutorials for 24GB VRAM GPUs.
  • Offers both a web-based UI and a CLI for flexible interaction.
  • Includes advanced features such as training only specific layers and support for the LoKr network type.
  • Provides examples for deployment on platforms like RunPod and Modal.

Maintenance & Community

  • The project is actively maintained, with the last update on 2025-04-22.
  • Support and community interaction are primarily directed to a Discord server.

Licensing & Compatibility

  • The base toolkit appears to be permissively licensed, but specific models like FLUX.1-dev have a non-commercial license, which is inherited by trained models. FLUX.1-schnell is Apache 2.0 licensed.
  • Commercial use is possible with Apache 2.0 licensed models, but requires careful attention to the specific model's license.

Limitations & Caveats

  • Training FLUX.1 requires at least 24GB of VRAM, and bugs have been reported when running natively on Windows.
  • FLUX.1-dev has a non-commercial license and requires Hugging Face authentication and license acceptance.
  • WebP image format has known issues.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 9
  • Issues (30d): 23
  • Star History: 821 stars in the last 90 days
