Code release for one-step image-to-image translation using Stable Diffusion Turbo
This repository provides a method for adapting single-step diffusion models such as SD-Turbo to a range of image-to-image translation tasks via adversarial learning. It targets researchers and developers who need efficient, high-quality image translation, with applications including sketch-to-image and day-to-night conversion, and offers near real-time inference.
How It Works
The approach consolidates the three modules of a latent diffusion model (the VAE encoder, the U-Net, and the VAE decoder) into a single end-to-end network with a small number of trainable weights, added primarily through LoRA adapters. Retraining the first layer of the U-Net and adding skip connections with zero-initialized convolutions (Zero-Convs) between encoder and decoder preserves structural information from the input while keeping inference to a single step, for both paired (pix2pix-turbo) and unpaired (CycleGAN-Turbo) translation tasks.
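The following PyTorch sketch illustrates the two building blocks named above: a zero-initialized 1x1 convolution for skip connections and a low-rank (LoRA) adapter around a frozen layer. The class names and shapes here are illustrative assumptions, not the repository's actual modules.

import torch
import torch.nn as nn

class ZeroConv(nn.Module):
    # 1x1 conv initialized to zero, so the skip path is a no-op at the
    # start of training and is learned gradually.
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.conv.weight)
        nn.init.zeros_(self.conv.bias)

    def forward(self, decoder_feat, encoder_feat):
        # At init this returns decoder_feat unchanged.
        return decoder_feat + self.conv(encoder_feat)

class LoRALinear(nn.Module):
    # Frozen base linear layer plus a trainable low-rank update.
    def __init__(self, base: nn.Linear, rank=8, alpha=8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)      # only the adapter is trained
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)   # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Example: wrap a frozen projection and fuse encoder/decoder features.
lora = LoRALinear(nn.Linear(64, 64))
skip = ZeroConv(channels=32)
x = torch.randn(1, 64)
dec, enc = torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16)
print(lora(x).shape, skip(dec, enc).shape)

Because both the Zero-Convs and the LoRA up-projections start at zero, training begins from the frozen pretrained model's behavior and only gradually departs from it, which is what keeps the trainable parameter count small.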
Quick Start & Requirements
Create and activate the conda environment:

conda env create -f environment.yaml
conda activate img2img-turbo

Alternatively, install the dependencies with pip inside a virtual environment:

pip install -r requirements.txt

Gradio demos for sketch-to-image and canny-edge-to-image are provided in gradio_sketch2image.py and gradio_canny2image.py.
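For context, the SD-Turbo base model can already be run in the single-step image-to-image regime through the Hugging Face diffusers library. The sketch below shows that baseline usage only; it is not the repository's adapted pipeline, and it assumes diffusers, a CUDA GPU, and a local file input.png.

import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Load the SD-Turbo base model (the model this project builds on).
pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

init_image = load_image("input.png").resize((512, 512))

# For SD-Turbo img2img, num_inference_steps * strength must be >= 1,
# so 2 steps at strength 0.5 amounts to a single denoising step.
out = pipe(
    prompt="a photo of a city street at night",
    image=init_image,
    num_inference_steps=2,
    strength=0.5,
    guidance_scale=0.0,  # SD-Turbo is trained without classifier-free guidance
).images[0]
out.save("output.png")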
Highlighted Details
Maintenance & Community
The project is associated with authors from CMU and Adobe. Links to community channels or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
The project uses Stable Diffusion-Turbo as a base model and therefore inherits the terms of that model's license. The README does not specify a separate license for the adapted code itself, so users should check both the repository and the SD-Turbo model card before any commercial use.
Limitations & Caveats
The README focuses on successful applications and does not detail known limitations, unsupported tasks, or potential issues with the adversarial training process. The training section links to external steps rather than providing them directly.