Diffusion model for video super-resolution
DLoRAL offers a one-step diffusion model for video super-resolution, focusing on detail enhancement and temporal consistency. It is designed for researchers and practitioners in computer vision and video processing who need to upscale low-quality videos.
How It Works
The framework uses a dynamic dual-stage training scheme built on two LoRA modules. The consistency stage trains C-LoRA for temporal coherence, while the enhancement stage trains D-LoRA to refine spatial details. Smooth loss interpolation between the two stages keeps training stable. During inference, both C-LoRA and D-LoRA are merged into a frozen diffusion UNet, so enhancement takes a single diffusion step.
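The two ideas in the paragraph above, blending stage losses smoothly and folding LoRA updates into frozen weights, can be illustrated with a small sketch. This is not the DLoRAL code: the linear ramp, the function names, and the toy matrices are all assumptions chosen for illustration, and the project may interpolate its losses differently.

```python
# Illustrative sketch only; names and the linear schedule are hypothetical.

def interpolation_weight(step, transition_steps):
    """Linear ramp from 0 to 1 over `transition_steps` (assumed schedule)."""
    if transition_steps <= 0:
        return 1.0
    return min(max(step / transition_steps, 0.0), 1.0)


def blended_loss(consistency_loss, enhancement_loss, step, transition_steps):
    """Blend two stage losses so the objective shifts gradually, not abruptly."""
    w = interpolation_weight(step, transition_steps)
    return (1.0 - w) * consistency_loss + w * enhancement_loss


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Add two matrices of the same shape element-wise."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def merge_lora(frozen_w, adapters):
    """Return W + sum(B @ A) for each (B, A) pair, leaving `frozen_w` untouched."""
    merged = [row[:] for row in frozen_w]
    for b, a in adapters:
        merged = add(merged, matmul(b, a))
    return merged


if __name__ == "__main__":
    frozen = [[1, 0], [0, 1]]        # stand-in for one frozen UNet weight
    c_lora = ([[1], [2]], [[3, 4]])  # rank-1 (B, A) pair standing in for C-LoRA
    d_lora = ([[0], [1]], [[1, 1]])  # rank-1 (B, A) pair standing in for D-LoRA
    print(merge_lora(frozen, [c_lora, d_lora]))
    print(blended_loss(2.0, 4.0, step=5, transition_steps=10))
```

Because the merged matrix already contains both low-rank updates, a forward pass needs only one matrix product per layer, which is what makes merging attractive for single-step inference.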
Quick Start & Requirements
Create a conda environment, activate it, and install the dependencies:
    conda create -n DLoRAL python=3.10
    conda activate DLoRAL
    pip install -r requirements.txt
    pip install openmim
    mim install mmcv-full
    pip install mmedit
Requirements include mmcv-full and PyTorch. Specific model weights (RAM, DAPE, and pretrained checkpoints) need to be downloaded and placed in the preset/models/ directory.
Run inference with:
    python src/test_DLoRAL.py \
        --pretrained_model_path stabilityai/stable-diffusion-2-1-base \
        --ram_ft_path /path/to/DLoRAL/preset/models/DAPE.pth \
        --ram_path /path/to/DLoRAL/preset/models/ram_swin_large_14m.pth \
        --pretrained_path /path/to/DLoRAL/preset/models/checkpoints/model.pkl \
        -i /path/to/input_videos/ \
        -o /path/to/results
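Since inference depends on three weight files being present under preset/models/, a small pre-flight check can catch a missing download before the model loads. The sketch below is a hypothetical helper, not part of the repository. The file paths and flags are copied from the inference command above, while the function names and the repo-root argument are assumptions.

```python
from pathlib import Path

# Flag -> weight path relative to the repository root, as used in the
# inference command above.
REQUIRED_WEIGHTS = {
    "--ram_ft_path": "preset/models/DAPE.pth",
    "--ram_path": "preset/models/ram_swin_large_14m.pth",
    "--pretrained_path": "preset/models/checkpoints/model.pkl",
}


def missing_weights(repo_root):
    """Return the relative paths of weight files not found under `repo_root`."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_WEIGHTS.values() if not (root / rel).is_file()]


def build_inference_command(repo_root, input_dir, output_dir):
    """Assemble the inference command as an argument list (hypothetical helper)."""
    root = Path(repo_root)
    cmd = [
        "python", "src/test_DLoRAL.py",
        "--pretrained_model_path", "stabilityai/stable-diffusion-2-1-base",
    ]
    for flag, rel in REQUIRED_WEIGHTS.items():
        cmd += [flag, str(root / rel)]
    cmd += ["-i", str(input_dir), "-o", str(output_dir)]
    return cmd


if __name__ == "__main__":
    repo = "/path/to/DLoRAL"  # placeholder, as in the command above
    absent = missing_weights(repo)
    if absent:
        print("Missing weights:", ", ".join(absent))
    else:
        print(" ".join(build_inference_command(repo, "/path/to/input_videos/", "/path/to/results")))
```

The returned list can be passed to `subprocess.run(cmd, cwd=repo)` so that the relative script path `src/test_DLoRAL.py` resolves from the repository root.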
Maintenance & Community
The project is associated with The Hong Kong Polytechnic University and OPPO Research Institute. Contact: yujingsun1999@gmail.com.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Training data is not yet released. The project is recent, and extensive community adoption or long-term maintenance is yet to be established.