DLoRAL  by yjsunnn

Diffusion model for video super-resolution

Created 3 months ago
273 stars

Top 94.6% on SourcePulse

Project Summary

DLoRAL offers a one-step diffusion model for video super-resolution, focusing on detail enhancement and temporal consistency. It is designed for researchers and practitioners in computer vision and video processing who need to upscale low-quality videos.

How It Works

The framework employs a dynamic dual-stage training scheme. The consistency stage optimizes temporal coherence, while the enhancement stage refines spatial details. Smooth loss interpolation between the two stages keeps training stable. During inference, the Consistency-LoRA (C-LoRA) and Detail-LoRA (D-LoRA) modules are merged into a frozen diffusion UNet, so enhancement takes a single diffusion step.
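To illustrate what smooth loss interpolation between two training stages can look like, here is a minimal sketch. The summary does not describe DLoRAL's actual schedule or loss terms, so the linear ramp and the names interpolation_weight, blended_loss, switch_step, and ramp_steps are all assumptions chosen for illustration.

```python
def interpolation_weight(step, switch_step, ramp_steps):
    """Return a weight in [0, 1] that ramps linearly from 0 to 1.

    The ramp starts at switch_step and lasts ramp_steps steps.
    Hypothetical schedule, not taken from the DLoRAL code.
    """
    if ramp_steps <= 0:
        # Degenerate case: a hard switch with no smoothing.
        return 1.0 if step >= switch_step else 0.0
    t = (step - switch_step) / ramp_steps
    return min(1.0, max(0.0, t))


def blended_loss(consistency_loss, enhancement_loss, step, switch_step, ramp_steps):
    """Blend the two stage losses so the objective changes gradually."""
    w = interpolation_weight(step, switch_step, ramp_steps)
    return (1.0 - w) * consistency_loss + w * enhancement_loss
```

With switch_step=100 and ramp_steps=50, the objective is purely the consistency loss up to step 100, an equal mix at step 125, and purely the enhancement loss from step 150 onward. A gradual ramp like this avoids the abrupt change in gradients that a hard switch between objectives would cause.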

Quick Start & Requirements

  • Installation: Clone the repository, create a Conda environment (conda create -n DLoRAL python=3.10), activate it (conda activate DLoRAL), and install dependencies (pip install -r requirements.txt, pip install openmim, mim install mmcv-full, pip install mmedit).
  • Prerequisites: Python 3.10, CUDA (implied by mmcv-full), PyTorch. The model weights (RAM, DAPE, and the pretrained checkpoints) must be downloaded and placed under the preset/models/ directory.
  • Inference: python src/test_DLoRAL.py --pretrained_model_path stabilityai/stable-diffusion-2-1-base --ram_ft_path /path/to/DLoRAL/preset/models/DAPE.pth --ram_path /path/to/DLoRAL/preset/models/ram_swin_large_14m.pth --pretrained_path /path/to/DLoRAL/preset/models/checkpoints/model.pkl -i /path/to/input_videos/ -o /path/to/results
  • Colab Demo: Available at https://colab.research.google.com/github/yjsunnn/DLoRAL/blob/main/notebooks/inference_dloral.ipynb
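The inference command above is long and easy to mistype, so a small wrapper can help. The sketch below assembles the same command from a repository root and reports which weight files are missing before anything is launched. The helper names (weight_paths, missing_weights, build_inference_command) are hypothetical; the script path, flags, and file locations are the ones listed in the command above.

```python
from pathlib import Path

# Base model identifier passed to --pretrained_model_path in the command above.
BASE_MODEL = "stabilityai/stable-diffusion-2-1-base"


def weight_paths(repo_root):
    """Map each weight-related flag to its expected file under preset/models/."""
    models = Path(repo_root) / "preset" / "models"
    return {
        "--ram_ft_path": models / "DAPE.pth",
        "--ram_path": models / "ram_swin_large_14m.pth",
        "--pretrained_path": models / "checkpoints" / "model.pkl",
    }


def missing_weights(repo_root):
    """Return the expected weight files that do not exist yet."""
    return [p.as_posix() for p in weight_paths(repo_root).values() if not p.exists()]


def build_inference_command(repo_root, input_dir, output_dir):
    """Build the argument list for src/test_DLoRAL.py (run it from repo_root)."""
    cmd = ["python", "src/test_DLoRAL.py", "--pretrained_model_path", BASE_MODEL]
    for flag, path in weight_paths(repo_root).items():
        cmd += [flag, path.as_posix()]
    cmd += ["-i", str(input_dir), "-o", str(output_dir)]
    return cmd
```

A typical use would be to call missing_weights first and, if it returns an empty list, pass the result of build_inference_command to subprocess.run(cmd, cwd=repo_root, check=True), since the script path is relative to the repository root.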

Highlighted Details

  • Official implementation of "One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution."
  • Achieves detail-rich and temporally consistent video upscaling.
  • Merges C-LoRA and D-LoRA into a frozen diffusion UNet for one-step inference.
  • Training code is available; releasing the training data is still listed as a TODO.
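Merging LoRA modules into a frozen network generally means folding each low-rank update into the base weight, W' = W + scale * (up x down), so that inference needs no extra adapter computation. The sketch below shows this for two adapters on a single weight matrix, using plain Python lists. It illustrates the general LoRA merge only; the function names, the scale factors, and the toy matrices are assumptions, and the repository's actual merge code may differ.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, adapters):
    """Return a copy of weight with every (down, up, scale) adapter folded in.

    weight: out x in matrix of the frozen layer
    down:   rank x in matrix, up: out x rank matrix
    """
    merged = [row[:] for row in weight]  # leave the frozen weight untouched
    for down, up, scale in adapters:
        delta = matmul(up, down)  # low-rank update, shape out x in
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Toy example: a 2x2 frozen weight and two rank-1 adapters (made-up values).
frozen = [[1.0, 0.0], [0.0, 1.0]]
c_lora = ([[1.0, 0.0]], [[1.0], [0.0]], 0.5)  # stands in for C-LoRA
d_lora = ([[0.0, 1.0]], [[0.0], [2.0]], 1.0)  # stands in for D-LoRA
merged = merge_lora(frozen, [c_lora, d_lora])
```

Because both updates are added to the same weight, the merged layer has the same shape and cost as the original one, which is what makes a single forward pass through the UNet sufficient at inference time.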

Maintenance & Community

The project is associated with The Hong Kong Polytechnic University and OPPO Research Institute. Contact: yujingsun1999@gmail.com.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training data is not yet released. The project is recent, so community adoption and long-term maintenance are not yet established.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 17 stars in the last 30 days
