CCSR by csslc

Research paper for content-consistent super-resolution via diffusion models

Created 2 years ago

586 stars

Top 55.4% on SourcePulse

Project Summary

This repository provides the official implementation of CCSR (Content Consistent Super-Resolution), a diffusion-based approach to image super-resolution. It targets two weaknesses of existing diffusion-based super-resolution methods: unstable outputs and inefficient sampling. It is aimed at researchers and practitioners in image processing and computer vision who need sharp results that stay stable across runs.

How It Works

CCSR employs a two-stage diffusion process. Stage 1 uses a ControlNet-like architecture to condition the diffusion process on the low-resolution input, which keeps the output consistent with the input content. Stage 2 refines the output. CCSR-v2 streamlines this into a single-stage workflow and supports flexible inference with as few as 1 or 2 diffusion steps without retraining, which improves efficiency. The project also introduces two stability metrics, G-STD and L-STD, for quantitative evaluation.

Quick Start & Requirements

Installation: Clone the repository, create a conda environment (conda create -n ccsr python=3.9, conda activate ccsr), and install requirements (pip install -r requirements.txt).
Prerequisites: Python >= 3.9, PyTorch, Diffusers library, Accelerate, xformers (for memory efficiency). Pretrained Stable Diffusion v2.1-base models and custom CCSR models (ControlNet, VAE) are required and can be downloaded from provided links.
Inference: Run test_ccsr_tile.py with specified arguments for one-step or multi-step inference. Options for tiling are available to manage GPU memory.
Resources: Requires substantial GPU VRAM, especially at higher resolutions; the tiling options noted under Inference are intended to reduce peak memory use.
Links: CCSR-v2 Branch, CCSR-v1 Branch, Pretrained Models
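
The tiling idea mentioned under Inference can be illustrated with a generic overlap-and-average scheme. The sketch below is not the code in test_ccsr_tile.py. The function names (tile_starts, process_tiled), the tile and stride defaults, and the nearest-neighbour upscale_2x stand-in for a super-resolution model are all assumptions chosen for illustration.

```python
import numpy as np


def tile_starts(length, tile, stride):
    """Start offsets of tiles of size `tile` that together cover [0, length)."""
    if stride > tile:
        raise ValueError("stride must not exceed tile, or gaps would appear")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the border
    return starts


def process_tiled(image, fn, tile=64, stride=48, scale=1):
    """Apply `fn` tile by tile and average the overlapping regions.

    `fn` maps an (h, w, ...) patch to an (h * scale, w * scale, ...) patch.
    Only one tile is passed to `fn` at a time, which bounds its peak memory.
    """
    h, w = image.shape[:2]
    extra = image.shape[2:]
    out = np.zeros((h * scale, w * scale) + extra, dtype=np.float64)
    # Per-pixel count of how many tiles contributed, broadcast over channels.
    weight = np.zeros((h * scale, w * scale) + (1,) * len(extra), dtype=np.float64)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            result = fn(image[y:y + tile, x:x + tile])
            ph, pw = result.shape[:2]
            ys, xs = y * scale, x * scale
            out[ys:ys + ph, xs:xs + pw] += result
            weight[ys:ys + ph, xs:xs + pw] += 1.0
    return out / weight


def upscale_2x(patch):
    """Hypothetical stand-in for a model: nearest-neighbour 2x upsampling."""
    return np.repeat(np.repeat(patch, 2, axis=0), 2, axis=1)
```

Calling process_tiled(image, upscale_2x, tile=64, stride=48, scale=2) on a 100 x 130 image returns a 200 x 260 result. A smaller tile lowers the memory needed per call at the cost of more calls, and a stride below the tile size gives overlap that softens seams between tiles.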

Highlighted Details

CCSR-v2 supports flexible inference with as few as 1 or 2 diffusion steps without re-training.
Introduces novel stability metrics: global standard deviation (G-STD) and local standard deviation (L-STD).
Achieves improved stability and enhanced clarity in super-resolution results compared to other diffusion-based methods.
Integrates tiling mechanisms for memory-efficient inference.
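
The two stability metrics can be sketched in a few lines. The formulation here is an assumption based only on the metric names: G-STD is taken as the standard deviation of an image-level quality score (PSNR in this sketch) over repeated runs on the same input, and L-STD as the per-pixel standard deviation over those runs, averaged across pixels. The paper's exact definitions may differ, and the function names are hypothetical.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, max_value]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


def g_std(reference, runs, metric=psnr):
    """Assumed G-STD: spread of an image-level score across repeated runs."""
    scores = np.array([metric(reference, run) for run in runs])
    return float(np.std(scores))


def l_std(runs):
    """Assumed L-STD: per-pixel spread across repeated runs, averaged over pixels."""
    stack = np.stack(runs, axis=0)  # shape: (num_runs, H, W, ...)
    return float(np.mean(np.std(stack, axis=0)))
```

Here runs would be a list of outputs produced from the same low-resolution input with different random seeds. Under these assumed definitions, lower values on both metrics indicate a more stable method.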

Maintenance & Community

The project is associated with researchers from The Hong Kong Polytechnic University and OPPO Research Institute. The primary contact is ling-chen.sun@connect.polyu.hk.

Licensing & Compatibility

Released under the Apache 2.0 license. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

Training requires substantial computational resources and carefully prepared training data. Although CCSR-v2 improves stability, diffusion sampling is inherently stochastic, so outputs can still vary slightly between runs.
