T2ITrainer by lrzjason

Text-to-image training scripts

Created 1 year ago

554 stars

Top 57.7% on SourcePulse

Project Summary

This repository provides diffusers-based scripts for training LoRA (Low-Rank Adaptation) adapters for image generation models, targeting users who want to fine-tune models such as Qwen Image, Flux, Kolors, and SD3.5. It focuses on ease of use and configuration flexibility, and aims to keep VRAM requirements manageable on consumer GPUs.

How It Works

T2ITrainer builds its training scripts on the Hugging Face diffusers library. Training is configuration-driven: users define parameters such as image suffixes, training layouts, and model types in JSON files. This streamlines the setup and execution of training tasks such as single-image training, traditional pair training, and mixed-layout training, with recommended settings for different GPU VRAM capacities.
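To illustrate what configuration-driven training can look like, here is a minimal sketch of loading and validating a JSON config. The key names (`model_type`, `image_suffix`, `training_layout`, `output_dir`), their values, and the layout names are assumptions made up for this example; the actual T2ITrainer schema may differ.

```python
import json

# Hypothetical config, for illustration only. The real T2ITrainer JSON
# schema may use different key names and structure.
EXAMPLE_CONFIG = """
{
  "model_type": "qwen_image",
  "image_suffix": "_T",
  "training_layout": "pair",
  "output_dir": "output/lora"
}
"""

# Assumed names for the single-image, pair, and mixed layouts.
ALLOWED_LAYOUTS = {"single", "pair", "mixed"}


def load_config(text):
    """Parse a JSON config string and validate the assumed required fields."""
    config = json.loads(text)
    missing = {"model_type", "training_layout"} - config.keys()
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")
    if config["training_layout"] not in ALLOWED_LAYOUTS:
        raise ValueError(f"unknown training layout: {config['training_layout']!r}")
    return config


config = load_config(EXAMPLE_CONFIG)
print(config["model_type"], config["training_layout"])
```

Validating the file before training starts surfaces a mistyped layout or a missing key immediately, rather than after models have been loaded onto the GPU.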

Quick Start & Requirements

Installation: Clone the repository and run setup.bat for an automated setup (virtual environment, dependencies, model downloads), or follow manual installation steps.
Prerequisites:
- PyTorch: torch>=2.3.0+cu121 (CUDA 12.1)
- Microsoft Visual C++ Redistributable (install it if DLL errors occur)
- diffusers library (update to v0.35.0 for Qwen edit support)
- huggingface-cli for model downloads
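The version prerequisites can be checked before training with a short script. This is a sketch, not part of T2ITrainer: the helper names are hypothetical, the minimums are taken from the list above, and the simplified version parsing ignores local build tags such as `+cu121` and pre-release suffixes.

```python
import re
from importlib import metadata

# Minimum versions taken from the prerequisites list.
MINIMUMS = {"torch": "2.3.0", "diffusers": "0.35.0"}


def version_tuple(version):
    """Turn '2.3.0+cu121' into (2, 3, 0); local tags and suffixes are ignored."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at non-numeric segments such as 'dev0'
        parts.append(int(match.group()))
    # Pad so that '2.3' compares equal to '2.3.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_prerequisites(minimums=MINIMUMS):
    """Report the status of each required package in the current environment."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[package] = f"{installed} ({'ok' if ok else 'need >= ' + minimum})"
    return report


for package, status in check_prerequisites().items():
    print(f"{package}: {status}")
```

Note that this only compares version numbers; it does not verify that the installed PyTorch build actually targets CUDA 12.1.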
VRAM Requirements:
- Qwen Image: 24GB (nf4) / 48GB (bf16)
- Flux Fill, Kontext, SD3.5: 24GB
- Kolors: 11GB
Links: Official GitHub Repository
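As a quick way to read the VRAM figures above, the following sketch lists which models fit a given amount of GPU memory. The function name and table layout are illustrative assumptions; the numbers are copied from the requirements list, with "Kontext" taken to mean Flux Kontext.

```python
# VRAM requirements in GB, copied from the list above.
VRAM_REQUIREMENTS_GB = {
    "Qwen Image (nf4)": 24,
    "Qwen Image (bf16)": 48,
    "Flux Fill": 24,
    "Flux Kontext": 24,
    "SD3.5": 24,
    "Kolors": 11,
}


def trainable_models(available_gb):
    """Return the models whose listed requirement fits in available_gb."""
    return sorted(
        name
        for name, required_gb in VRAM_REQUIREMENTS_GB.items()
        if required_gb <= available_gb
    )


print(trainable_models(12))  # ['Kolors']
print(trainable_models(24))
```

With PyTorch installed, the available memory in GB could be read from `torch.cuda.get_device_properties(0).total_memory / 1024**3`. Cards marketed at a given size usually report slightly less than that, so a small tolerance may be needed in practice.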

Highlighted Details

- Supports training for Qwen Image, Flux Fill/Kontext, Kolors, and SD3.5 models.
- Offers flexible configuration options via JSON files for various training layouts.
- Includes an automated setup script (setup.bat) for simplified installation.
- Provides specific VRAM usage estimates and recommended parameters for different models and GPU sizes.

Maintenance & Community

The project is under active development with frequent updates and changelogs available. Contact and community channels include X (Twitter) @Lrzjason, QQ Group 866612947, and WeChat ID fkdeai. Sponsorship information is available in the sponsor_list.txt file.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The project is in active development and stability is not guaranteed; updates are frequent, so users should check the changelogs. UI selection is not yet supported for all training scripts. The "Kolors Black Image Issue" may require using an FP16-fixed VAE.

Health Check

Last Commit: 3 days ago

Responsiveness: Inactive

Star History

13 stars in the last 30 days