SeeSR by cswry

Real-world image super-resolution research paper (CVPR 2024)

Created 2 years ago

605 stars

Top 54.1% on SourcePulse

Project Summary

SeeSR tackles real-world image super-resolution by making the restoration process semantics-aware, so that restored details stay consistent with the actual content of the image. It is aimed at computer vision and image processing researchers and practitioners who need to enhance low-resolution images degraded in real-world conditions.

How It Works

SeeSR uses a pretrained diffusion model (Stable Diffusion 2 base) as its generative backbone for super-resolution. It adds a component called DAPE (Degradation-Aware Prompt Extractor), which extracts semantic prompts from the degraded low-resolution input. These prompts guide the diffusion process, leading to more contextually appropriate and detailed outputs.
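
One way to picture semantic guidance is to turn tags predicted for the low-resolution image into a text prompt for the diffusion model. The sketch below is illustrative only: the function name, the threshold, and the example tag scores are hypothetical and are not taken from the SeeSR code.

```python
from typing import Dict


def tags_to_prompt(
    tag_scores: Dict[str, float],
    threshold: float = 0.5,
    max_tags: int = 10,
) -> str:
    """Join the most confident tags into a comma-separated prompt string.

    Hypothetical helper: a real tagger would supply `tag_scores`.
    """
    # Keep only tags whose confidence reaches the threshold.
    kept = [(tag, score) for tag, score in tag_scores.items() if score >= threshold]
    # Most confident first; ties are broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return ", ".join(tag for tag, _ in kept[:max_tags])


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    scores = {"dog": 0.92, "grass": 0.81, "car": 0.12, "frisbee": 0.55}
    print(tags_to_prompt(scores))
```

Dropping low-confidence tags is the design point here: on a heavily degraded input, a wrong tag could steer the generator toward content that is not in the image.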

Quick Start & Requirements

Installation: Clone the repository, create a Python 3.8 environment, and install requirements: pip install -r requirements.txt.
Prerequisites: Python >= 3.8, PyTorch, Hugging Face diffusers, BasicSR. Pretrained models for Stable Diffusion 2 base, SeeSR, and DAPE are required.
Inference: Download models, place test images in preset/datasets/test_datasets, and run python test_seesr.py with specified model paths and parameters. A turbo mode (test_seesr_turbo.py) with 2 inference steps is also available.
Demo: A Gradio demo is provided via python gradio_seesr.py.
Resources: Inference requires significant GPU memory; a tiled VAE method is suggested to reduce memory use.
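
The inference step above can be sketched as a small pre-flight helper. It checks that test images are present in preset/datasets/test_datasets and assembles the command for test_seesr.py or test_seesr_turbo.py. The helper names and the list of image suffixes are assumptions. The model-path and parameter flags are not listed in this summary, so they are passed through unchanged as extra arguments rather than guessed.

```python
import sys
from pathlib import Path
from typing import List, Sequence

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def list_test_images(folder: Path) -> List[Path]:
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_command(turbo: bool = False, extra_args: Sequence[str] = ()) -> List[str]:
    """Assemble the inference command without running it.

    `extra_args` carries the model paths and parameters exactly as the
    project README specifies them; none are invented here.
    """
    script = "test_seesr_turbo.py" if turbo else "test_seesr.py"
    return [sys.executable, script, *extra_args]


if __name__ == "__main__":
    folder = Path("preset/datasets/test_datasets")
    if not folder.is_dir():
        print(f"Missing folder: {folder} (run this from the repository root)")
    else:
        images = list_test_images(folder)
        print(f"Found {len(images)} test image(s) in {folder}")
        print("Command:", " ".join(build_command(extra_args=sys.argv[1:])))
```

The returned list can be handed to subprocess.run once the required flags from the README have been supplied.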

Highlighted Details

Accepted at CVPR 2024.
Offers a "turbo" mode for faster inference (2 steps).
Available on Replicate for online testing.
Releases RealLR200, a dataset of real-world low-resolution images.
Training scripts for both DAPE and SeeSR are provided.

Maintenance & Community

The project is actively updated, with recent news including the release of an OSEDiff model for faster results. Community interaction points are not explicitly listed, but contact information for the primary author is provided.

Licensing & Compatibility

The project and its weights are released under the Apache 2.0 license, which generally permits commercial use and linking with closed-source projects.

Limitations & Caveats

The README mentions ongoing work for SeeSR-SDXL and face/text specific models, suggesting current versions may not be optimized for all specific content types. Training requires substantial data preparation and computational resources.

Explore Similar Projects

LinFusion by Huage001

sd-webui-bmab by portu-sim

karlo by kakaobrain

sd-webui-stablesr by pkuliyi2015

Lumina-Image-2.0 by Alpha-VLLM

HYPIR by XPixelGroup

StableSR by IceClear

DemoFusion by PRIS-CV

ComfyUI-Impact-Pack by ltdrdata

SUPIR by Fanghua-Yu

guided-diffusion by openai

upscayl by upscayl