sd-webui-stablesr by pkuliyi2015

WebUI extension for high-fidelity image upscaling

Created 2 years ago

1,097 stars

Top 34.5% on SourcePulse

Project Summary

StableSR for Stable Diffusion WebUI is an image upscaling extension for users of the Automatic1111 Stable Diffusion WebUI. It aims to deliver detail comparable to closed-source upscalers while keeping VRAM consumption low and preserving color fidelity.

How It Works

This extension integrates the StableSR super-resolution method into the Automatic1111 WebUI. It leverages a diffusion prior for image upscaling, designed to preserve facial identity and details across various image types. The implementation includes optimizations for reduced VRAM usage compared to the original StableSR and offers a "Wavelet Color Fix" post-processing technique to mitigate color shifts common in tiling-based upscaling.
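The difference between an AdaIN-style correction and a wavelet-style correction can be illustrated with a small sketch. This is not the extension's code: it works on a single row of one color channel, uses one box blur as a stand-in for the wavelet decomposition, and the function names `adain_match`, `box_blur`, and `low_frequency_swap` are hypothetical. Both functions assume the upscaled row and the reference row (the original resized to the same length) have equal length.

```python
from statistics import fmean, pstdev


def adain_match(target, source):
    """Shift and scale `target` so its mean and std match `source` (one channel)."""
    t_mean, t_std = fmean(target), pstdev(target)
    s_mean, s_std = fmean(source), pstdev(source)
    # Guard against a flat channel, where the std is zero.
    scale = s_std / t_std if t_std > 0 else 1.0
    return [(v - t_mean) * scale + s_mean for v in target]


def box_blur(values, radius):
    """Low-pass filter: average each sample with its neighbours.

    The window is truncated at the edges rather than padded.
    """
    n = len(values)
    return [fmean(values[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def low_frequency_swap(target, source, radius=2):
    """Keep the detail (high frequencies) of `target` and take the
    low frequencies, which carry most of the color, from `source`."""
    t_low = box_blur(target, radius)
    s_low = box_blur(source, radius)
    return [t - tl + sl for t, tl, sl in zip(target, t_low, s_low)]


if __name__ == "__main__":
    reference = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    shifted = [v + 5.0 for v in reference]  # same detail, uniform color shift
    print(adain_match(shifted, reference))
    print(low_frequency_swap(shifted, reference))
```

AdaIN-style matching corrects only global statistics per channel, whereas the frequency swap corrects color locally, which is one plausible reason a wavelet-based fix copes better with shifts that vary from tile to tile.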

Quick Start & Requirements

Installation: Install via the "Extensions" tab in Automatic1111 WebUI (either "Available" or "Install from URL" with https://github.com/pkuliyi2015/sd-webui-stablesr.git).
Prerequisites:
- Automatic1111 Stable Diffusion WebUI.
- Stable Diffusion V2.1 EMA checkpoint (512 or 768 version).
- StableSR module (~400MB) downloaded and placed in extensions/sd-webui-stablesr/models/.
- Optional: Tiled Diffusion & VAE extension, VQGAN VAE (~700MB) placed in models/VAE.
Setup Time: Downloading models and extensions may take several minutes depending on internet speed.
Links: Project Page, Official Repository, Paper on arXiv
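A quick way to confirm that the downloaded files landed in the folders named above is a small check script. This is a sketch under assumptions: the helper names are hypothetical, the accepted file extensions (`.ckpt`, `.safetensors`) are a guess because the summary does not name the files, and the Stable Diffusion V2.1 checkpoint is not checked because its folder is not stated here.

```python
from pathlib import Path

# Assumption: weights use one of these extensions; the summary does not say which.
MODEL_SUFFIXES = {".ckpt", ".safetensors"}


def has_model_file(folder: Path) -> bool:
    """Return True if `folder` exists and holds at least one model-like file."""
    if not folder.is_dir():
        return False
    return any(
        p.is_file() and p.suffix.lower() in MODEL_SUFFIXES
        for p in folder.iterdir()
    )


def check_stablesr_setup(webui_root: str) -> dict:
    """Report whether the folders from the prerequisites list contain model files.

    The VAE check only confirms that some model file is present in models/VAE;
    it cannot tell whether that file is the VQGAN VAE.
    """
    root = Path(webui_root)
    return {
        "stablesr_module": has_model_file(
            root / "extensions" / "sd-webui-stablesr" / "models"
        ),
        "vqgan_vae_optional": has_model_file(root / "models" / "VAE"),
    }


if __name__ == "__main__":
    # Replace "." with the path to the WebUI installation.
    for name, present in check_stablesr_setup(".").items():
        print(f"{name}: {'found' if present else 'missing'}")
```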

Highlighted Details

Offers both SD 2.1 768 and SD 2.1 512 versions, with the 768 version noted for fewer artifacts.
Recommends using Tiled Diffusion & VAE for larger resolutions or when less than 12 GB of VRAM is available.
Includes a "Wavelet Color Fix" for color matching, described as a significant improvement over the official AdaIN method.
Negative prompts are highlighted as crucial for enhancing details.
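Tiling lowers peak VRAM because only one bounded region is processed at a time. The sketch below shows one generic way to cover an image with overlapping tiles; it is not the Tiled Diffusion & VAE extension's actual algorithm, and the function names, tile size, and overlap values are assumptions chosen for illustration.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of overlapping tiles that cover `length` pixels."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the edge
    return starts


def tile_boxes(width, height, tile, overlap):
    """Return (x0, y0, x1, y1) boxes covering a width x height image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


if __name__ == "__main__":
    # Hypothetical sizes: a 2048x2048 target cut into 512-pixel tiles.
    boxes = tile_boxes(2048, 2048, tile=512, overlap=64)
    print(len(boxes), "tiles; first:", boxes[0], "last:", boxes[-1])
```

Overlap lets neighbouring tiles be blended at the seams; color differences between tiles are the kind of artifact a color fix step is meant to reduce.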

Maintenance & Community

The project is a migration of the original StableSR project.
No specific community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

Licensed under S-Lab License 1.0.
Strictly prohibits commercial use of the code and checkpoint. Generated images may not be used commercially either, unless explicit permission is obtained via email.

Limitations & Caveats

Commercial use is strictly prohibited by the S-Lab License 1.0.
Results may differ from the official examples because of sampler differences and a modified VQVAE decoder (lacking the CFW component for large images).
The SDP attention optimization may cause out-of-memory (OOM) errors; xformers is recommended as an alternative.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

1 star in the last 30 days