sd-webui-stablesr  by pkuliyi2015

WebUI extension for high-fidelity image upscaling

Created 2 years ago
1,094 stars

Top 34.9% on SourcePulse

Project Summary

StableSR for Stable Diffusion WebUI is an extension that adds high-fidelity image upscaling to the Automatic1111 Stable Diffusion WebUI. It aims to match the detail of closed-source upscalers while keeping VRAM consumption low and preserving color fidelity.

How It Works

This extension integrates the StableSR super-resolution method into the Automatic1111 WebUI. It leverages a diffusion prior for image upscaling, designed to preserve facial identity and details across various image types. The implementation includes optimizations for reduced VRAM usage compared to the original StableSR and offers a "Wavelet Color Fix" post-processing technique to mitigate color shifts common in tiling-based upscaling.
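
The idea behind a frequency-split color fix can be sketched in a few lines. The Python below is a minimal illustration, not the extension's code: it assumes a single color channel stored as a list of rows, a reference image already resized to the output resolution, and a plain box blur standing in for the wavelet low-pass. The names `box_blur` and `frequency_split_color_fix` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one color channel, row-major


def box_blur(img: Image, radius: int) -> Image:
    """Low-pass filter: mean over a (2r+1)^2 window, truncated at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def frequency_split_color_fix(upscaled: Image, reference: Image, radius: int = 2) -> Image:
    """Keep the upscaled detail, take the coarse color from the reference.

    Assumption: both images have the same size. A box blur replaces the
    wavelet decomposition purely to keep the sketch short.
    """
    low_up = box_blur(upscaled, radius)     # coarse color of the upscaled image
    low_ref = box_blur(reference, radius)   # coarse color of the reference
    return [
        [u - lu + lr for u, lu, lr in zip(row_u, row_lu, row_lr)]
        for row_u, row_lu, row_lr in zip(upscaled, low_up, low_ref)
    ]
```

In this sketch a uniform color shift in the upscaled channel lives entirely in the low-frequency band, so it is replaced by the reference's low-frequency band while the fine detail added by upscaling is left untouched.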

Quick Start & Requirements

  • Installation: Install via the "Extensions" tab in Automatic1111 WebUI (either "Available" or "Install from URL" with https://github.com/pkuliyi2015/sd-webui-stablesr.git).
  • Prerequisites:
    • Automatic1111 Stable Diffusion WebUI.
    • Stable Diffusion V2.1 EMA checkpoint (512 or 768 version).
    • StableSR module (~400MB) downloaded and placed in extensions/sd-webui-stablesr/models/.
    • Optional: Tiled Diffusion & VAE extension, VQGAN VAE (~700MB) placed in models/VAE.
  • Setup Time: Downloading models and extensions may take several minutes depending on internet speed.
  • Links: Project Page, Official Repository, Paper on arXiv
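
After installing, it can help to confirm that the files ended up where the prerequisites say they should. The following Python sketch checks only the folder layout listed above. The function name `check_stablesr_layout` is hypothetical, and treating `.ckpt` and `.safetensors` as the weight-file extensions is an assumption, since the summary does not name the files.

```python
from pathlib import Path
from typing import Dict


def check_stablesr_layout(webui_root: str) -> Dict[str, bool]:
    """Report whether the folders named in the prerequisites contain weights."""
    root = Path(webui_root)
    ext_dir = root / "extensions" / "sd-webui-stablesr"
    module_dir = ext_dir / "models"        # StableSR module goes here
    vae_dir = root / "models" / "VAE"      # optional VQGAN VAE goes here

    def has_weights(folder: Path) -> bool:
        # Assumption: weight files end in .ckpt or .safetensors.
        return folder.is_dir() and any(
            p.suffix in {".ckpt", ".safetensors"} for p in folder.iterdir()
        )

    return {
        "extension_installed": ext_dir.is_dir(),
        "stablesr_module_present": has_weights(module_dir),
        "optional_vae_present": has_weights(vae_dir),
    }
```

Calling `check_stablesr_layout("/path/to/stable-diffusion-webui")` returns three booleans; a `False` for `optional_vae_present` is not an error, since that file is listed as optional.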

Highlighted Details

  • Offers both SD 2.1 768 and SD 2.1 512 versions, with the 768 version noted for fewer artifacts.
  • Recommends using Tiled Diffusion & VAE for larger resolutions and lower VRAM usage (<12GB).
  • Includes a "Wavelet Color Fix" for color matching, which the project presents as a marked improvement over the official AdaIN method.
  • Negative prompts are highlighted as crucial for enhancing details.
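
Tiling lowers VRAM usage because only one tile is processed at a time. As a rough sketch of the general idea, and not of the Tiled Diffusion & VAE extension's actual algorithm, the Python below computes overlapping tile boxes that cover an image. The function names and the default tile size and overlap (512 and 64 pixels) are assumptions chosen for illustration.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of overlapping tiles that cover one axis of the image."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]  # a single tile already covers the axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(
    width: int, height: int, tile: int = 512, overlap: int = 64
) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes in row-major order."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]
```

Each box would be upscaled independently and the overlapping borders blended afterwards. Blending tiles that were processed separately is one way color shifts can appear between them, which is the problem the color fix is meant to address.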

Maintenance & Community

  • The project is a migration of the original StableSR project.
  • No specific community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • Licensed under S-Lab License 1.0.
  • Strictly prohibits commercial use of the code and checkpoint. Output images may not be used commercially either, unless explicit permission is obtained by email.

Limitations & Caveats

  • Commercial use is strictly prohibited by the S-Lab License 1.0.
  • Results may differ from official examples due to sampler differences and a modified VQVAE decoder (lacking CFW component for large images).
  • SDP attention optimization may cause OOM errors; xformers is recommended as an alternative.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 30 days
