This toolkit provides a comprehensive suite for fine-tuning diffusion models, aimed at users who want to train image and video models on consumer-grade hardware. It offers both a GUI and a CLI, aiming to be easy to use while still exposing an extensive set of training features.
How It Works
The toolkit builds on PyTorch and supports training techniques such as LoRA and LoKr. It includes dataset preparation features, such as automatic resizing and aspect ratio handling, and gives fine-grained control over which model layers are trained through the only_if_contains and ignore_if_contains network arguments.
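As an illustration of how substring-based layer selection of this kind can work, here is a minimal sketch. It is not the toolkit's implementation: the function select_trainable_layers, the layer names, and the exact matching rules (a layer must contain at least one only_if_contains substring and none of the ignore_if_contains substrings) are assumptions made for the example.

```python
from typing import Iterable, List, Optional


def select_trainable_layers(
    layer_names: Iterable[str],
    only_if_contains: Optional[List[str]] = None,
    ignore_if_contains: Optional[List[str]] = None,
) -> List[str]:
    """Return the layer names that pass both substring filters (hypothetical helper)."""
    selected = []
    for name in layer_names:
        # Assumed semantics: when only_if_contains is given, a layer must
        # contain at least one of its substrings to be kept.
        if only_if_contains and not any(s in name for s in only_if_contains):
            continue
        # Assumed semantics: a layer containing any ignore substring is skipped.
        if ignore_if_contains and any(s in name for s in ignore_if_contains):
            continue
        selected.append(name)
    return selected


# Hypothetical layer names, for illustration only.
layers = [
    "transformer.blocks.7.attn.to_q",
    "transformer.blocks.7.attn.to_k",
    "transformer.blocks.7.ff.proj_out",
    "transformer.blocks.20.attn.to_q",
]

chosen = select_trainable_layers(
    layers,
    only_if_contains=["blocks.7."],
    ignore_if_contains=["ff."],
)
print(chosen)
```

With these inputs, only the two attention layers in block 7 are kept. The trailing dot in "blocks.7." matters for plain substring matching: without it, a pattern like "blocks.7" would also match a layer in a block numbered 70 or higher.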
Quick Start & Requirements
- Installation: Clone the repository, create a Python virtual environment, install PyTorch (the CUDA 12.6 build, cu126), and then install the requirements.
- Prerequisites: Python >3.10, an NVIDIA GPU with sufficient VRAM, and Node.js >18 (for the UI).
- UI: Run npm run build_and_start in the ui directory. Access at http://localhost:8675.
- Auth Token: Set the AI_TOOLKIT_AUTH environment variable to secure the UI.
- Documentation: Tutorials and examples are available for FLUX.1 training, RunPod, and Modal.
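The prerequisites above can be checked before installing. The following is a rough preflight sketch, not part of the toolkit: the function names parse_node_major and preflight are hypothetical, and the version bounds are read literally from the list (strictly greater than 3.10 and 18), which may be stricter than the project intends.

```python
import os
import re
import shutil
import subprocess
import sys
from typing import List, Mapping, Optional, Tuple


def parse_node_major(version_output: str) -> Optional[int]:
    """Extract the major version from `node --version` output such as 'v20.11.1'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def preflight(
    python_version: Tuple[int, int],
    node_version_output: str,
    env: Mapping[str, str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Assumption: ">3.10" is read literally, so 3.10 itself is flagged.
    if python_version <= (3, 10):
        problems.append("Python newer than 3.10 is required")
    # Assumption: ">18" is read literally, so Node.js 18 itself is flagged.
    node_major = parse_node_major(node_version_output)
    if node_major is None or node_major <= 18:
        problems.append("Node.js newer than 18 is required for the UI")
    # AI_TOOLKIT_AUTH is the variable named above for securing the UI.
    if not env.get("AI_TOOLKIT_AUTH"):
        problems.append("AI_TOOLKIT_AUTH is not set, so the UI will not be secured")
    return problems


if __name__ == "__main__":
    # Ask the local Node.js install for its version, if one is on PATH.
    node_path = shutil.which("node")
    node_output = ""
    if node_path:
        result = subprocess.run([node_path, "--version"], capture_output=True, text=True)
        node_output = result.stdout
    for problem in preflight(sys.version_info[:2], node_output, os.environ):
        print("WARNING:", problem)
```

The checks take their inputs as arguments rather than reading the system directly, so they can be exercised with made-up values. GPU and VRAM checks are left out because they depend on PyTorch already being installed.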
Highlighted Details
- Supports training on consumer-grade hardware, with specific tutorials for GPUs with 24 GB of VRAM.
- Offers both a web-based UI and a CLI for flexible interaction.
- Includes advanced features like training specific layers and supporting LoKr network type.
- Provides examples for deployment on platforms like RunPod and Modal.
Maintenance & Community
- The project is actively maintained, with the last update on 2025-04-22.
- Support and community interaction are primarily directed to a Discord server.
Licensing & Compatibility
- The base toolkit appears to be permissively licensed, but individual models carry their own terms: FLUX.1-dev has a non-commercial license, which models trained from it inherit, while FLUX.1-schnell is Apache 2.0 licensed.
- Commercial use is possible with Apache 2.0 licensed models, but requires careful attention to the specific model's license.
Limitations & Caveats
- Training FLUX.1 requires at least 24 GB of VRAM.
- Native Windows support has reported bugs.
- FLUX.1-dev has a non-commercial license and requires Hugging Face authentication and license acceptance.
- WebP image format has known issues.