SeeSR by cswry

Real-world image super-resolution research paper (CVPR 2024)

created 1 year ago
566 stars

Top 57.7% on sourcepulse

Project Summary

SeeSR addresses real-world image super-resolution by making the restoration process semantics-aware, with the aim of producing results that stay more faithful to image content than methods that ignore semantics. It targets researchers and practitioners in computer vision and image processing who need to enhance low-resolution images.

How It Works

SeeSR builds on a pretrained diffusion model (Stable Diffusion 2 base) adapted for super-resolution. Its key component is DAPE, the Degradation-Aware Prompt Extractor, which derives semantic prompts from the degraded low-resolution input. These prompts guide the diffusion process, leading to more contextually appropriate and detailed outputs.

Quick Start & Requirements

  • Installation: Clone the repository, create a Python 3.8 environment, and install the requirements with pip install -r requirements.txt.
  • Prerequisites: Python >= 3.8, PyTorch, Hugging Face diffusers, and BasicSR. Pretrained models for Stable Diffusion 2 base, SeeSR, and DAPE are required.
  • Inference: Download the models, place test images in preset/datasets/test_datasets, and run python test_seesr.py, passing the model paths and inference parameters. A turbo mode (test_seesr_turbo.py) that uses 2 inference steps is also available.
  • Demo: A Gradio demo can be started with python gradio_seesr.py.
  • Resources: Inference requires significant GPU memory; a tiled VAE method is suggested to reduce it.
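
The idea behind tiling can be illustrated with a short sketch. This is not SeeSR's tiled VAE code; it only shows, under assumed values, how a large image can be covered by fixed-size overlapping tiles so that each tile is processed separately and peak memory depends on the tile size rather than the image size. The function names tile_starts and tile_boxes, the 512-pixel tile, and the 64-pixel overlap are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) with the given overlap.

    Hypothetical helper for illustration; not part of the SeeSR code base.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]  # a single (possibly smaller) tile covers everything
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(
    height: int, width: int, tile: int = 512, overlap: int = 64
) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes covering a height x width image."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            bottom = min(top + tile, height)
            right = min(left + tile, width)
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes(1024, 1536):
        print(box)
```

With these assumed values, a side of 1024 pixels yields tile offsets 0, 448, and 512, so neighbouring tiles share a band of pixels. A real tiled pipeline would also blend the overlapping regions when stitching the outputs back together, which this sketch leaves out.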

Highlighted Details

  • Accepted at CVPR 2024.
  • Offers a "turbo" mode for faster inference (2 steps).
  • Available on Replicate for online testing.
  • Releases RealLR200, a dataset of real-world low-resolution images.
  • Provides training scripts for both DAPE and SeeSR.

Maintenance & Community

The project is actively updated, with recent news including the release of an OSEDiff model for faster results. Community interaction points are not explicitly listed, but contact information for the primary author is provided.

Licensing & Compatibility

The project and its weights are released under the Apache 2.0 license, which generally permits commercial use and linking with closed-source projects.

Limitations & Caveats

The README mentions ongoing work on SeeSR-SDXL and on face- and text-specific models, which suggests the current release may not be optimized for those content types. Training requires substantial data preparation and computational resources.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

38 stars in the last 90 days
