StableSR by IceClear

Research paper for real-world image super-resolution using diffusion prior

Created 2 years ago

2,629 stars

Top 17.6% on SourcePulse

Project Summary

StableSR tackles real-world image super-resolution by using a diffusion model as a prior. It is aimed at researchers and users who need high-fidelity upscaling, and it tries to improve on traditional methods by building diffusion priors into the super-resolution process.

How It Works

StableSR uses a diffusion model as a prior to guide super-resolution. It generates a high-resolution image conditioned on the low-resolution input, synthesizing ("hallucinating") plausible detail that the input does not contain. The aim is more realistic results with fewer artifacts than methods that rely solely on interpolation or on single-stage generative models. The diffusion-based design also allows arbitrary upscaling factors and supports features such as negative prompts for finer control.
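To make the negative-prompt idea concrete, here is a minimal sketch of how a negative prompt is commonly applied in Stable Diffusion tooling through classifier-free guidance. This page does not say how StableSR implements it, so the formulation, the function name guided_noise, and the plain-list representation of the noise predictions are all assumptions made for illustration.

```python
from typing import List, Sequence


def guided_noise(
    eps_positive: Sequence[float],
    eps_negative: Sequence[float],
    scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    Assumption: the sampler predicts noise once with the desired prompt
    (eps_positive) and once with the negative prompt (eps_negative).
    The result starts at the negative prediction and moves toward the
    positive one; scale > 1 pushes past it, away from the negative prompt.
    """
    if len(eps_positive) != len(eps_negative):
        raise ValueError("noise predictions must have the same length")
    return [n + scale * (p - n) for p, n in zip(eps_positive, eps_negative)]
```

With scale set to 1 the negative prompt has no effect and the positive prediction is returned unchanged; larger values steer each denoising step further from whatever the negative prompt describes.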

Quick Start & Requirements

Install: Clone the repository, create a conda environment using environment.yaml, install xformers, taming-transformers, clip, and the package itself.
Prerequisites: PyTorch 1.12.1, CUDA 11.7, pytorch-lightning 1.4.2. xformers (0.0.16) is optional but recommended.
Resources: Upscaling 128x128 to 512x512 requires ~8.9GB of GPU memory. Larger resolutions or tiling may require 10GB or more.
Demos: Online demos are available on Hugging Face, OpenXLab, and Replicate. A Colab demo is also provided.
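Because the prerequisites are pinned to specific versions, a quick check of the active environment can save debugging time. The sketch below compares installed packages against the versions listed above. The distribution names (torch, pytorch-lightning, xformers) and the helper names are assumptions, not part of the repository.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions copied from the prerequisites above; distribution names are assumed.
PINNED = {"torch": "1.12.1", "pytorch-lightning": "1.4.2"}
OPTIONAL = {"xformers": "0.0.16"}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Report 'ok', 'missing', or a mismatch message for each pinned package."""
    report = {}
    for dist, wanted in pins.items():
        found = lookup(dist)
        if found is None:
            report[dist] = "missing"
        elif found.split("+")[0] == wanted:
            # Ignore local build tags such as '+cu117' when comparing.
            report[dist] = "ok"
        else:
            report[dist] = f"mismatch (found {found}, expected {wanted})"
    return report


if __name__ == "__main__":
    for name, status in {**check(PINNED), **check(OPTIONAL)}.items():
        print(f"{name}: {status}")
```

The lookup parameter exists only so the comparison logic can be exercised without the packages installed. A "missing" result for xformers is not an error, since that package is optional.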

Highlighted Details

Accepted by IJCV 2024.
Supports StableSR with SD-Turbo for faster inference.
Offers ComfyUI integration for workflow management.
Includes training scripts for CFW and FaceSR.
Provides test sets for easy comparison with paper results.

Maintenance & Community

The most recent updates added SD-Turbo support and ComfyUI integration, although the Health Check below reports the last commit as 1 year ago. Demos are linked on Hugging Face, OpenXLab, and Replicate, and the maintainers can be reached by email.

Licensing & Compatibility

Licensed under NTU S-Lab License 1.0. Redistribution and use must follow this license.

Limitations & Caveats

Testing on arbitrary sizes without tiling requires over 10GB GPU memory. Tiled inference for arbitrary sizes requires at least 18GB, with potential border artifacts depending on tile size and stride. FaceSR requires pre-generated reference images from models like CodeFormer.
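The interaction between tile size, stride, and border artifacts is easier to see with a small sketch. The code below is not taken from the StableSR repository. It is a generic illustration of overlapping tiling along one image axis, and the function names and the 512/448 values in the usage note are assumptions.

```python
from typing import List


def tile_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets of overlapping tiles that cover an axis of `length` pixels.

    The last tile is shifted back so it ends exactly at the image edge.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if stride > tile:
        raise ValueError("stride larger than tile would leave gaps")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)
    return starts


def coverage(length: int, tile: int, stride: int) -> List[int]:
    """Count how many tiles cover each pixel position along the axis."""
    counts = [0] * length
    for start in tile_starts(length, tile, stride):
        for i in range(start, min(start + tile, length)):
            counts[i] += 1
    return counts
```

For a 1000-pixel axis with a hypothetical tile of 512 and stride of 448, tile_starts returns [0, 448, 488]. Pixels covered by more than one tile are typically blended, for example by dividing the summed tile outputs by the coverage count. A smaller stride increases overlap, which tends to hide seams at the cost of more tiles to process.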

Health Check

Last Commit: 1 year ago

Responsiveness: Inactive

Star History

9 stars in the last 30 days