attention-map-diffusers by wooyeolbaek

Attention map tool for Hugging Face Diffusers

Created 2 years ago

391 stars

Top 73.6% on SourcePulse

Project Summary

This repository provides tools for extracting and visualizing cross-attention maps from Hugging Face Diffusers pipelines, targeting researchers and developers working with diffusion models. It enables deeper understanding of how models interpret prompts by highlighting spatial relationships between text tokens and image features.

How It Works

The library injects hooks into compatible Diffusers pipelines to capture cross-attention map data during the generation process. It then processes these maps, allowing for saving and visualization based on specific timesteps and layers, offering granular insight into the diffusion model's internal workings.
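The capture mechanism described above can be illustrated with a minimal, dependency-free sketch. This is not the repository's code: `AttentionStore`, `hook`, `cross_attention_weights`, and the layer name `block_0.attn` are hypothetical names chosen for illustration, and the toy attention function stands in for a real Diffusers attention layer. It only shows the general idea of wrapping an attention computation so its weights are recorded per timestep and per layer.

```python
import math
from collections import defaultdict


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_weights(queries, keys):
    """softmax(Q K^T / sqrt(d)): one row per image position, one column per text token."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


class AttentionStore:
    """Hypothetical container: maps[timestep][layer_name] -> attention weights."""

    def __init__(self):
        self.maps = defaultdict(dict)
        self.current_timestep = None

    def hook(self, layer_name, attention_fn):
        """Wrap an attention function so each call also records its weights."""
        def wrapped(queries, keys):
            weights = attention_fn(queries, keys)
            self.maps[self.current_timestep][layer_name] = weights
            return weights
        return wrapped


store = AttentionStore()
layer = store.hook("block_0.attn", cross_attention_weights)

# Toy data: 4 image positions (a 2x2 latent grid), 3 text tokens, feature size 2.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

# Simulate two denoising steps; the hook stores one map per step.
for timestep in (999, 500):
    store.current_timestep = timestep
    layer(queries, keys)

# Spatial map for text token 0 at timestep 999, reshaped to the 2x2 grid.
weights = store.maps[999]["block_0.attn"]
token_column = [row[0] for row in weights]
token_map = [token_column[0:2], token_column[2:4]]
```

Keying the store by timestep first and layer second makes it straightforward to save or plot only the steps and layers of interest, which is the kind of selection the summary describes.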

Quick Start & Requirements

Install via pip: pip install attention_map_diffusers, or run pip install -e . from a local clone of the repository for an editable install.
Requires Python and PyTorch.
GPU with CUDA is recommended for performance.
Compatible with Hugging Face Diffusers v0.32.0 and later.
Supports models like Stable Diffusion 3.5, Flux-dev, Flux-schnell, Stable Diffusion 3, Stable Diffusion XL, and Stable Diffusion 2.1.
Official documentation and examples are available in the repository.

Highlighted Details

Compatible with recent models including Stable Diffusion 3.5 and Flux variants.
Supports batch operations for Stable Diffusion 3 (with caveats on CPU memory).
Allows saving attention maps based on specific timesteps and layers.
Enables CPU offloading to save VRAM.

Maintenance & Community

Actively updated, with recent compatibility additions for SD3.5 and Flux models.
Issue tracker available on GitHub for bug reports and feature requests.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README.

Limitations & Caveats

Batch operations are not recommended for SD3 due to potential CPU memory exhaustion.
Compatibility with "Sana" models is planned for a future update.
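To see why batched runs can exhaust CPU memory, a back-of-the-envelope estimate helps. The sketch below is an assumption-laden illustration, not a measurement: `attention_map_bytes` is a hypothetical helper, the figures in `config` are placeholders rather than values taken from the repository, and it assumes every head's map is retained at every step and layer in 16-bit precision.

```python
def attention_map_bytes(batch, steps, layers, heads,
                        image_tokens, text_tokens, bytes_per_value=2):
    """Bytes needed to keep one map per (image, step, layer, head)."""
    values = batch * steps * layers * heads * image_tokens * text_tokens
    return values * bytes_per_value


# Placeholder figures for illustration only; not taken from the repository.
config = dict(steps=28, layers=24, heads=24, image_tokens=4096, text_tokens=333)

for batch in (1, 2, 4):
    gib = attention_map_bytes(batch=batch, **config) / 2**30
    print(f"batch={batch}: {gib:.1f} GiB if every map is retained")
```

Because the total is a plain product, it grows linearly with batch size; restricting capture to a few timesteps or layers shrinks it by the same factor.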

Health Check

Last Commit: 3 weeks ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1

Star History

8 stars in the last 30 days

Explore Similar Projects

Min-SNR-Diffusion-Training by TiankaiHang

Accelerate diffusion model training

Created 2 years ago

Updated 1 year ago

VLM-Visualizer by zjysteven

Visualizing attention in vision-language models

Created 1 year ago

Updated 1 year ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

inspectus by labmlai

Visualization tool for machine learning models in Jupyter notebooks

Created 1 year ago

Updated 1 year ago

Universal-Guided-Diffusion by arpitbansal297

PyTorch code for universal diffusion guidance

Created 3 years ago

Updated 2 years ago

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Jiaming Song (Chief Scientist at Luma AI).

gated_attention by qiuzh20

Gated attention for LLMs: Non-linearity, sparsity, and attention-sink-free

Created 9 months ago

Updated 2 months ago

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Thomas Wolf (Cofounder of Hugging Face), and 1 more.

exbert by bhoov

Visual analysis tool for Transformer model representations (research paper)

Created 6 years ago

Updated 2 years ago

Attend-and-Excite by yuval-alaluf

Research paper implementation for text-to-image diffusion models

Created 3 years ago

Updated 2 years ago

daam by castorini

Research paper implementation for interpreting Stable Diffusion models

Created 3 years ago

Updated 1 year ago

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA).

FateZero by ChenyangQiQi

Zero-shot video editor (ICCV 2023 Oral) using attention fusion

Created 2 years ago

Updated 2 years ago

Starred by Jiaming Song (Chief Scientist at Luma AI).

custom-diffusion by adobe-research

Text-to-image fine-tuning research paper

Created 3 years ago

Updated 2 months ago

Machine-Learning by DorsaRoh

ML implementations from scratch, using NumPy

Created 1 year ago

Updated 6 months ago

Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 9 more.

bertviz by jessevig

Interactive tool for visualizing attention in Transformer language models

Created 7 years ago

Updated 1 month ago
