multidiffusion-upscaler-for-automatic1111  by pkuliyi2015

Extension for generating/upscaling large images in Stable Diffusion WebUI

Created 2 years ago
4,973 stars

Top 10.0% on SourcePulse

Project Summary

This extension addresses the challenge of generating and upscaling high-resolution images (2K+) with limited VRAM (<=6GB) for Stable Diffusion users. It provides advanced tiling techniques for both diffusion and VAE processes, enabling users to create or enhance large images that would otherwise be impossible on lower-end hardware.

How It Works

The extension implements several state-of-the-art tiling techniques, including Tiled Diffusion and its variants (MultiDiffusion, Demofusion), alongside original Tiled VAE and Tiled Noise Inversion methods. This approach breaks down large images into smaller, manageable tiles that are processed sequentially. By tiling the diffusion and VAE operations, the extension significantly reduces VRAM requirements, allowing for the generation and upscaling of ultra-large images on consumer-grade GPUs.
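The tiling idea described above can be illustrated with a toy sketch. This is not the extension's code: the helper names `tile_starts` and `process_tiled` are hypothetical, the "image" is a plain list of lists, and overlapping regions are merged by simple averaging, which is only one possible way to fuse tiles. The point is that the expensive step only ever sees one tile-sized crop at a time.

```python
from typing import Callable, List

Grid = List[List[float]]


def tile_starts(size: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles of length `tile` that together cover [0, size)."""
    if tile >= size:
        return [0]  # a single tile already covers the whole dimension
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile sits flush with the far edge
    return starts


def process_tiled(image: Grid, tile: int, overlap: int,
                  process: Callable[[Grid], Grid]) -> Grid:
    """Run `process` on one tile at a time and average overlapping outputs."""
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for y0 in tile_starts(height, tile, overlap):
        for x0 in tile_starts(width, tile, overlap):
            # Only this crop is handed to the expensive step.
            crop = [row[x0:x0 + tile] for row in image[y0:y0 + tile]]
            out = process(crop)
            for dy, out_row in enumerate(out):
                for dx, value in enumerate(out_row):
                    total[y0 + dy][x0 + dx] += value
                    count[y0 + dy][x0 + dx] += 1
    # Every cell is covered by at least one tile, so count is never zero.
    return [[total[y][x] / count[y][x] for x in range(width)]
            for y in range(height)]


# Usage: a stand-in "model" that doubles every value, applied to a 6x10 grid.
image = [[float(x + y) for x in range(10)] for y in range(6)]
result = process_tiled(image, tile=4, overlap=2,
                       process=lambda t: [[2 * v for v in row] for row in t])
```

In this sketch peak memory for the processing step scales with the tile size rather than the full image size, at the cost of redundant work in the overlapping regions.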

Quick Start & Requirements

  • Install: Install as a Stable Diffusion WebUI extension.
  • Prerequisites: Stable Diffusion WebUI, Python 3.10.6, PyTorch 2.0.1, CUDA 11.8 or 12.1.
  • Documentation: Detailed documentation and examples are available on the wiki. A quickstart tutorial is also provided.

Highlighted Details

  • Supports ultra-large image generation (txt2img) and upscaling (img2img).
  • Features Regional Prompt Control for localized prompt application.
  • Integrates with ControlNet, StableSR, SDXL, and Demofusion.
  • Offers original Tiled VAE and Tiled Noise Inversion methods.

Maintenance & Community

Recent activity is low: the last commit was about a year ago, and maintainer responsiveness is rated inactive. The README provides no details on community engagement or a roadmap.

Licensing & Compatibility

Licensed under CC BY-NC-SA 4.0. Versions after March 28, 2023, prohibit commercial sales of the code itself, though derived artworks are not restricted. This license may restrict integration into closed-source commercial applications.

Limitations & Caveats

The CC BY-NC-SA 4.0 license restricts commercial use of the code. Because the restriction applies to versions released after March 28, 2023, users should check which version they are working with.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 20 stars in the last 30 days
