T2I-Adapter by TencentARC

T2I-Adapter for controllable text-to-image diffusion models (SD-XL)

created 2 years ago
3,725 stars

Top 13.3% on sourcepulse

Project Summary

This repository provides T2I-Adapter, a method for adding controllability to text-to-image diffusion models, specifically Stable Diffusion XL (SDXL). It offers lightweight adapters (around 77M parameters) that are combined with a pre-trained SDXL model to guide image generation with conditioning inputs such as sketches, Canny edges, line art, and depth maps. Because only the adapter is trained, adding a control type is inexpensive, and users keep SDXL's high-quality generation while gaining precise control.

How It Works

T2I-Adapter introduces small, trainable adapter modules that attach to the diffusion model's architecture. Each adapter converts a conditioning input (e.g., an edge map or pose skeleton) into features that are added to the model's intermediate features, so the condition steers generation alongside the text prompt. The large pre-trained diffusion model (SDXL) remains frozen; only the adapters are trained. This significantly reduces the compute and memory needed for training and allows new control modalities to be added without retraining the entire model.
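
The frozen-backbone idea described above can be illustrated with a toy sketch. This is not the T2I-Adapter architecture: the real adapter is a neural network that feeds SDXL, whereas here scalar weights stand in for whole layers. FrozenBlock, ToyAdapter, forward, and train_step are hypothetical names invented for this illustration. The sketch only shows the training pattern: adapter features are added to the backbone's intermediate activations, and the update step touches adapter parameters only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrozenBlock:
    """Stand-in for one pre-trained backbone layer; its weight is never updated."""
    weight: float

    def forward(self, x: List[float]) -> List[float]:
        return [self.weight * v for v in x]


@dataclass
class ToyAdapter:
    """Small trainable module (hypothetical): one scalar gain per backbone block."""
    gains: List[float]

    def features(self, condition: List[float]) -> List[List[float]]:
        # One feature vector per block, derived from the conditioning input.
        return [[g * c for c in condition] for g in self.gains]


def forward(blocks, adapter, x, condition, scale=1.0):
    """Run the backbone, adding scaled adapter features after each block."""
    h = x
    for block, feat in zip(blocks, adapter.features(condition)):
        h = block.forward(h)
        h = [hv + scale * fv for hv, fv in zip(h, feat)]
    return h


def train_step(blocks, adapter, x, condition, target, lr=0.05):
    """One gradient-descent step on the adapter gains only.

    Uses a finite-difference gradient to stay dependency-free.
    Returns the mean squared error measured before the update.
    """
    def loss():
        out = forward(blocks, adapter, x, condition)
        return sum((o - t) ** 2 for o, t in zip(out, target)) / len(out)

    eps = 1e-6
    before = loss()
    grads = []
    for i in range(len(adapter.gains)):
        adapter.gains[i] += eps
        grads.append((loss() - before) / eps)
        adapter.gains[i] -= eps
    for i, g in enumerate(grads):
        adapter.gains[i] -= lr * g  # backbone weights are never modified
    return before
```

Calling train_step repeatedly lowers the loss while every FrozenBlock weight stays at its initial value. Setting scale to 0 in forward recovers the unmodified backbone output.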

Quick Start & Requirements

  • Installation: run pip install -r requirements.txt, then pip install git+https://github.com/huggingface/diffusers.git@t2iadapterxl
  • Prerequisites: Python >= 3.8, PyTorch >= 2.0.1, controlnet_aux==0.0.7, transformers, accelerate, safetensors. Inference requires at least 15GB of GPU memory.
  • Models: Weights are downloaded automatically, or can be downloaded manually from a provided URL.
  • Demos & Docs: Hugging Face Gradio demos and tutorials are available for various adapters.
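
The prerequisites listed above can be checked before installing anything heavy. The following is a minimal sketch, not part of the repository: parse_version and check_environment are hypothetical helpers, and the thresholds are copied from the bullet list (Python 3.8, PyTorch 2.0.1, 15 GB of GPU memory).

```python
import re
from typing import List, Sequence, Tuple

# Thresholds taken from the prerequisites listed above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (2, 0, 1)
MIN_GPU_GB = 15.0


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '2.1.0+cu118' into (2, 1, 0).

    Local build suffixes after '+' and non-numeric components are ignored.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(python_version: Sequence[int],
                      torch_version: str,
                      gpu_memory_gb: float) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python >= 3.8 is required")
    if parse_version(torch_version) < MIN_TORCH:
        problems.append("PyTorch >= 2.0.1 is required")
    if gpu_memory_gb < MIN_GPU_GB:
        problems.append("at least 15 GB of GPU memory is required for inference")
    return problems
```

On a machine with PyTorch installed, the inputs could come from sys.version_info, torch.__version__, and torch.cuda.get_device_properties(0).total_memory / 1024**3. The helper takes plain values so it can run without a GPU.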

Highlighted Details

  • Supports SDXL with lightweight adapters (77M parameters), inheriting SDXL's high-quality generation.
  • Offers adapters for sketch, Canny, lineart, OpenPose, and depth (Midas, Zoe).
  • Enables composable adapters (CoAdapter) for combining multiple control signals.
  • Stability AI's Stable Doodle sketch-to-image tool is based on T2I-Adapter and SDXL.

Maintenance & Community

The project is a collaboration between Tencent ARC Lab and Hugging Face. Posted updates include the integration of SDXL support and new adapter types. Links to Hugging Face demos and tutorials are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. The models are hosted on Hugging Face, where each model carries the license chosen by its publisher, so hosting alone does not imply permissive terms. Use in commercial or closed-source projects requires checking the license of each specific model.

Limitations & Caveats

The README notes that some SDXL adapters are still under development and may need further improvement because of limited computing resources. Inference requires a substantial amount of GPU memory (15GB+). The repository was recently shrunk using BFG, a tool that rewrites Git history, so existing clones may no longer match the remote and may need to be re-cloned.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 65 stars in the last 90 days
