StableSR for Stable Diffusion WebUI provides ultra high-quality image upscaling, targeting users of the Automatic1111 Stable Diffusion WebUI. It aims to deliver detailed upscaling comparable to closed-source solutions, while being mindful of VRAM consumption and color fidelity.
How It Works
This extension integrates the StableSR super-resolution method into the Automatic1111 WebUI. It leverages a diffusion prior for image upscaling, designed to preserve facial identity and details across various image types. The implementation includes optimizations for reduced VRAM usage compared to the original StableSR and offers a "Wavelet Color Fix" post-processing technique to mitigate color shifts common in tiling-based upscaling.
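The idea behind a frequency-based color fix can be sketched as follows. This is a simplified illustration and not the extension's implementation: it uses a single box blur as a stand-in for a wavelet decomposition, works on one channel stored as nested lists, and assumes the reference image has already been resized to the upscaled resolution. The names `box_blur` and `color_fix` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one channel, values in [0, 1]


def box_blur(img: Image, radius: int) -> Image:
    """Blur with a (2*radius+1)^2 box filter, clamping coordinates at the edges."""
    h, w = len(img), len(img[0])
    area = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / area
    return out


def color_fix(upscaled: Image, reference: Image, radius: int = 2) -> Image:
    """Keep the fine detail of `upscaled`, take the low frequencies from `reference`."""
    low_up = box_blur(upscaled, radius)
    low_ref = box_blur(reference, radius)
    h, w = len(upscaled), len(upscaled[0])
    return [
        [
            # detail (upscaled minus its blur) plus the reference's coarse color
            min(max(upscaled[y][x] - low_up[y][x] + low_ref[y][x], 0.0), 1.0)
            for x in range(w)
        ]
        for y in range(h)
    ]
```

In this sketch a uniform brightness or color offset in the upscaled image lives entirely in the low-frequency band, so swapping that band for the reference's removes the shift while leaving the generated detail untouched.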
Quick Start & Requirements
- Installation: Install via the "Extensions" tab in Automatic1111 WebUI (either "Available" or "Install from URL" with https://github.com/pkuliyi2015/sd-webui-stablesr.git).
- Prerequisites:
  - Automatic1111 Stable Diffusion WebUI.
  - Stable Diffusion V2.1 EMA checkpoint (512 or 768 version).
  - StableSR module (~400MB), downloaded and placed in extensions/sd-webui-stablesr/models/.
  - Optional: the Tiled Diffusion & VAE extension, and the VQGAN VAE (~700MB) placed in models/VAE.
- Setup Time: Downloading models and extensions may take several minutes depending on internet speed.
- Links: Project Page, Official Repository, Paper on arXiv
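As a quick sanity check after installation, the folder layout listed above can be verified with a small script. This is a minimal sketch, not part of the extension: the function name `check_layout` and the labels are hypothetical, and it only confirms that each folder exists and contains at least one file, since the exact model filenames are not stated here.

```python
from pathlib import Path
from typing import Dict

# Folders relative to the WebUI root, as listed in the prerequisites.
EXPECTED_DIRS = {
    "StableSR module": "extensions/sd-webui-stablesr/models",
    "VQGAN VAE (optional)": "models/VAE",
}


def check_layout(webui_root: str) -> Dict[str, bool]:
    """Report, per expected folder, whether it exists and holds at least one file."""
    root = Path(webui_root)
    status: Dict[str, bool] = {}
    for label, relative in EXPECTED_DIRS.items():
        folder = root / relative
        status[label] = folder.is_dir() and any(p.is_file() for p in folder.iterdir())
    return status
```

Calling `check_layout("/path/to/stable-diffusion-webui")` returns a dictionary that maps each label to `True` or `False`; the path shown is a placeholder for the local install directory.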
Highlighted Details
- Offers both SD 2.1 768 and SD 2.1 512 versions, with the 768 version noted for fewer artifacts.
- Recommends the Tiled Diffusion & VAE extension for larger output resolutions and for GPUs with limited VRAM (under 12GB).
- Includes a "Wavelet Color Fix" for improved color matching, a significant improvement over the official AdaIN method.
- Negative prompts are highlighted as crucial for enhancing details.
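To illustrate why tiling lowers peak memory, the sketch below computes a grid of overlapping tiles that covers an image, so each tile can be processed on its own. It is a generic scheme under assumed defaults (512-pixel tiles, 64-pixel overlap) and does not reproduce the Tiled Diffusion extension's actual algorithm; `tile_starts` and `tile_boxes` are hypothetical names.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of overlapping tiles that together cover [0, length)."""
    if tile >= length:
        return [0]  # a single tile already spans this axis
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_boxes(
    width: int, height: int, tile: int = 512, overlap: int = 64
) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes in row-major order."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]
```

With these assumed defaults, a 1024x768 image yields six tiles. Neighbouring tiles share a band of pixels, and blending across those seams is the usual source of the color shifts that a color fix step is meant to correct.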
Maintenance & Community
- The project is a migration of the original StableSR project.
- No specific community links (Discord/Slack) or roadmap are provided in the README.
Licensing & Compatibility
- Licensed under S-Lab License 1.0.
- Strictly prohibits commercial use of the code and checkpoint. Outcome images are also prohibited from commercial use unless explicit permission is obtained via email.
Limitations & Caveats
- Commercial use is prohibited under the S-Lab License 1.0 (see Licensing & Compatibility).
- Results may differ from the official examples because of sampler differences and a modified VQVAE decoder that omits the CFW (Controllable Feature Wrapping) component for large images.
- The SDP attention optimization may cause out-of-memory (OOM) errors; xformers is recommended as an alternative.