ToMe for SD accelerates Stable Diffusion inference by merging redundant tokens within the model's transformer blocks, which reduces the computational load. It is aimed at Stable Diffusion users who want faster generation and lower memory consumption without retraining the model.
How It Works
ToMe for SD applies token merging to Stable Diffusion's transformer components. Merging redundant tokens leaves each block with fewer tokens to process, so the model performs fewer operations and uses less memory. The method is designed to limit quality degradation even at aggressive merging ratios, and it can be combined with other optimizations such as xFormers.
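To make the idea of merging redundant tokens concrete, here is a minimal pure-Python sketch of similarity-based merging. It is an illustration under simplifying assumptions, not the library's implementation: the function name `merge_tokens`, the even/odd split into destination and source sets, and averaging as the merge rule are all choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_tokens(tokens, r):
    """Merge up to r source tokens into their most similar destination token.

    Hypothetical helper for illustration only. Returns a shorter token list:
    the (possibly averaged) destinations followed by the unmerged sources.
    """
    # Assumption: even positions are destinations, odd positions are sources.
    dst = [list(t) for t in tokens[0::2]]
    src = [list(t) for t in tokens[1::2]]
    r = max(0, min(r, len(src)))

    # For every source token, find its most similar destination token.
    best = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        best.append((sims[j], i, j))

    # Keep only the r most similar (source, destination) pairs.
    best.sort(reverse=True)
    merged = {i: j for _, i, j in best[:r]}

    # Average each destination with the sources merged into it.
    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for i, j in merged.items():
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1
    new_dst = [[v / c for v in s] for s, c in zip(sums, counts)]

    kept_src = [s for i, s in enumerate(src) if i not in merged]
    return new_dst + kept_src


# Four toy 2-D tokens forming two near-duplicate pairs.
toy = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
reduced = merge_tokens(toy, 2)  # both sources are merged, leaving 2 tokens
```

With `r` set to 2, each near-duplicate pair collapses into one averaged token, which is the kind of redundancy the method exploits. A block that then runs on the shorter list has less work to do.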
Quick Start & Requirements
- Install via pip:
pip install tomesd
- Requires PyTorch >= 1.12.1.
- Supports Stable Diffusion v1, v2, Latent Diffusion, and Diffusers pipelines.
- Setup is minimal: the model is patched from Python after it is loaded.
- Official documentation and examples are available.
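The Python patch mentioned above can be sketched as a small wrapper around the library's `tomesd.apply_patch(model, ratio=...)` call. The wrapper name `apply_tome` and the `max_ratio` parameter are hypothetical, and the 0.75 cap reflects one reading of the library's default settings, so treat it as an assumption and check the official documentation before relying on it.

```python
def apply_tome(model, ratio=0.5, max_ratio=0.75):
    """Patch an already-loaded model in place and return it.

    Hypothetical convenience wrapper. `max_ratio` is an assumed upper bound
    on the merge ratio, not a value confirmed by this document.
    """
    if not 0.0 <= ratio <= max_ratio:
        raise ValueError(f"ratio must be in [0, {max_ratio}], got {ratio}")

    # Imported lazily so the ratio check works even without tomesd installed.
    import tomesd

    tomesd.apply_patch(model, ratio=ratio)
    return model


# Example (not run here): `pipe` would be a Stable Diffusion model or a
# Diffusers pipeline that has already been loaded.
# pipe = apply_tome(pipe, ratio=0.5)
```

Keeping the import inside the function means a script can still run unpatched on a machine where the package is missing, as long as the call is guarded.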
Highlighted Details
- Achieves up to a 2x speedup and roughly 5.7x lower memory usage with 60% of tokens merged.
- Minimal quality loss, with FID scores remaining close to baseline even at high merge ratios.
- Implemented in pure Python, requiring no CUDA compilation.
- Compatible with efficient transformer implementations like xFormers and Flash Attention.
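As a rough back-of-envelope check on why merging tokens can save so much memory, the sketch below counts entries in a single self-attention score matrix, which grows with the square of the token count. It assumes a 64x64 latent grid (4096 tokens, as for a 512x512 image) and that the number of merged tokens is the ratio times the token count, rounded down. Both are assumptions for illustration, and the function name `merged_attention_cost` is hypothetical.

```python
def merged_attention_cost(tokens, ratio):
    """Return (tokens kept, reduction factor of the attention score matrix)."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    merged = int(tokens * ratio)  # assumed rounding rule
    kept = tokens - merged
    # Self-attention scores form a tokens x tokens matrix.
    return kept, (tokens * tokens) / (kept * kept)


kept, reduction = merged_attention_cost(64 * 64, 0.6)
print(kept, round(reduction, 2))  # prints: 1639 6.25
```

This counts only one attention matrix at the highest resolution. Measured speed and memory figures depend on the whole model, the attention implementation, and the hardware, so the estimate should not be read as an explanation of the reported numbers.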
Maintenance & Community
- The project is actively maintained by Daniel Bolya and Judy Hoffman.
- Available via pip since April 2023, with ongoing support for Diffusers.
- Citations are provided for both the Stable Diffusion specific work and the original ToMe paper.
Licensing & Compatibility
- The repository does not explicitly state a license in the README.
- Compatibility with commercial or closed-source applications is not specified.
Limitations & Caveats
- The process is lossy, so generated images will differ from those the unpatched model produces.
- The speed-up may be less pronounced on the first inference runs because of PyTorch graph setup.
- Because the merging process involves randomness, consistent results across batches require setting the random seed manually.
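The seed caveat above can be illustrated with a toy example. The sketch assumes the randomness comes from choosing one destination token at random inside each small window of the token grid. It uses Python's standard `random` module as a stand-in for the framework's generator, and the function name `pick_destinations` is hypothetical.

```python
import random


def pick_destinations(height, width, sy=2, sx=2, seed=None):
    """Pick one (row, col) destination per sy x sx window of a token grid."""
    rng = random.Random(seed)
    picks = []
    for top in range(0, height, sy):
        for left in range(0, width, sx):
            # Clamp so that partial windows at the border stay inside the grid.
            row = min(top + rng.randrange(sy), height - 1)
            col = min(left + rng.randrange(sx), width - 1)
            picks.append((row, col))
    return picks


# The same seed reproduces the same choices; without a seed they may differ.
first = pick_destinations(4, 4, seed=0)
second = pick_destinations(4, 4, seed=0)
same = first == second  # True, because both calls share the seed
```

In a real pipeline the equivalent step would be seeding the framework's random generator before each run. If random destination choice causes trouble, the library's `apply_patch` is understood to accept a `use_rand` option that can be turned off, though that should be confirmed against the official documentation.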