StableSR by IceClear

Research paper on real-world image super-resolution using a diffusion prior

created 2 years ago
2,504 stars

Top 19.1% on sourcepulse

Project Summary

StableSR addresses real-world image super-resolution by using a diffusion model as a prior. It targets researchers and users who need high-fidelity upscaling, and aims to improve image quality beyond traditional methods by integrating the diffusion prior into the super-resolution process.

How It Works

StableSR uses a diffusion model as a prior to guide super-resolution. It generates high-resolution images conditioned on low-resolution inputs, synthesizing ("hallucinating") plausible detail that the input does not contain. This yields more realistic results with fewer artifacts than methods that rely solely on interpolation or single-stage generative models. The diffusion-based design also supports arbitrary upscaling factors and features such as negative prompts for finer control.

Quick Start & Requirements

  • Install: Clone the repository, create a conda environment using environment.yaml, install xformers, taming-transformers, clip, and the package itself.
  • Prerequisites: PyTorch 1.12.1, CUDA 11.7, pytorch-lightning 1.4.2. xformers (0.0.16) is optional but recommended.
  • Resources: Upscaling a 128x128 input to 512x512 requires ~8.9GB of GPU memory. Larger resolutions or tiling may require 10GB+.
  • Demos: Online demos are available on Hugging Face, OpenXLab, and Replicate. A Colab demo is also provided.
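
Before running inference, it can help to confirm that the installed packages match the pinned versions listed under Prerequisites. The following is a minimal sketch, not part of StableSR: the helper names (`installed_version`, `compare`) are hypothetical, the distribution names are assumed to be `torch`, `pytorch-lightning`, and `xformers`, and the CUDA version is not checked.

```python
from importlib import metadata

# Versions quoted in the Prerequisites bullet; distribution names are assumptions.
REQUIRED = {
    "torch": "1.12.1",
    "pytorch-lightning": "1.4.2",
    "xformers": "0.0.16",
}
# xformers is described as optional, so its absence is not reported.
OPTIONAL = {"xformers"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare(required, installed, optional=frozenset()):
    """Return a list of readable problems; an empty list means all pins match."""
    problems = []
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None:
            if name not in optional:
                problems.append(f"{name}: missing (want {wanted})")
            continue
        # Ignore local build tags such as '+cu117' when comparing.
        if found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, want {wanted}")
    return problems


if __name__ == "__main__":
    installed = {name: installed_version(name) for name in REQUIRED}
    for line in compare(REQUIRED, installed, OPTIONAL) or ["environment matches"]:
        print(line)
```

An exact-match comparison is deliberately strict here because the listing pins specific versions; whether nearby versions also work is not stated in the source.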

Highlighted Details

  • Accepted by IJCV 2024.
  • Supports StableSR with SD-Turbo for faster inference.
  • Offers ComfyUI integration for workflow management.
  • Includes training scripts for CFW and FaceSR.
  • Provides test sets for easy comparison with paper results.

Maintenance & Community

The most recent updates added SD-Turbo support and ComfyUI integration. Demos are linked on Hugging Face, OpenXLab, and Replicate, and the maintainers can be reached by email.

Licensing & Compatibility

Licensed under NTU S-Lab License 1.0. Redistribution and use must follow this license.

Limitations & Caveats

Testing on arbitrary sizes without tiling requires over 10GB GPU memory. Tiled inference for arbitrary sizes requires at least 18GB, with potential border artifacts depending on tile size and stride. FaceSR requires pre-generated reference images from models like CodeFormer.
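
To illustrate how tile size and stride interact, the sketch below computes overlapping tile boxes that cover an image. It is an assumption-laden illustration rather than StableSR's actual tiling code: the function names and the default values (`tile=512`, `stride=448`) are hypothetical. A stride smaller than the tile size makes neighbouring tiles overlap, which is the usual way to soften seams at tile borders.

```python
def tile_starts(length, tile, stride):
    """Start offsets of tiles covering [0, length) along one axis."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if stride > tile:
        raise ValueError("stride larger than tile would leave gaps")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    # Add a final tile flush with the far edge if the stride left a remainder.
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def tile_boxes(height, width, tile=512, stride=448):
    """Return (top, left, bottom, right) boxes that together cover the image."""
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in tile_starts(height, tile, stride)
        for left in tile_starts(width, tile, stride)
    ]


if __name__ == "__main__":
    for box in tile_boxes(1024, 1024):
        print(box)
```

Smaller strides increase overlap and the number of tiles to process, so runtime grows as seams are reduced.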

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 97 stars in the last 90 days
