img2img-turbo by GaParmar

Code for the research paper on one-step image-to-image translation using Stable Diffusion Turbo

created 1 year ago
2,146 stars

Top 21.5% on sourcepulse

Project Summary

This repository provides a method for adapting single-step diffusion models such as SD-Turbo to image-to-image translation tasks through adversarial learning. It targets researchers and developers who need efficient, high-quality image translation, for example sketch-to-image or day-to-night conversion, at near real-time inference speeds.

How It Works

The approach consolidates the three separate modules of a latent diffusion model (encoder, U-Net, and decoder) into a single end-to-end network with few trainable weights, most of them in LoRA adapters. The model retrains the first layer of the U-Net and adds skip connections and Zero-Convs between input and output, which lets it translate an image while preserving the structure of the input. The same one-step generator serves both paired (pix2pix-turbo) and unpaired (CycleGAN-Turbo) translation tasks.
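LoRA adapters and Zero-Convs share one useful property: both start as no-ops, so the adapted network initially behaves exactly like the frozen pretrained one and training only has to learn the difference. The following is a minimal pure-Python sketch of that idea, not the repository's implementation. The class names LoRALinear and ZeroConvSkip, the matrix sizes, and the rank are all hypothetical choices made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration class, not taken from the repository.
    """
    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weight, never updated
        # A gets a small random init, B starts at zero, so B @ A is zero at first.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scaling = alpha / rank

    def forward(self, x):
        update = [[v * self.scaling for v in row] for row in matmul(self.B, self.A)]
        return matmul(add(self.weight, update), x)

    def num_trainable(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

class ZeroConvSkip:
    """Skip connection through a zero-initialized 1x1 convolution.

    A 1x1 convolution is a per-pixel channel-mixing matrix, so features are
    stored here as (channels x pixels) matrices. Hypothetical illustration class.
    """
    def __init__(self, channels):
        self.kernel = [[0.0] * channels for _ in range(channels)]

    def forward(self, decoder_feat, encoder_feat):
        return add(decoder_feat, matmul(self.kernel, encoder_feat))

rng = random.Random(1)
dim = 8
frozen_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
x = [[rng.uniform(-1, 1)] for _ in range(dim)]  # column vector

lora = LoRALinear(frozen_w, rank=2)
base_out = matmul(frozen_w, x)
lora_out = lora.forward(x)

skip = ZeroConvSkip(channels=dim)
dec = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(dim)]  # 8 channels, 4 pixels
enc = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(dim)]
skip_out = skip.forward(dec, enc)

print("LoRA layer matches frozen layer at init:", lora_out == base_out)
print("Zero-conv skip leaves decoder features unchanged at init:", skip_out == dec)
print("Trainable LoRA parameters:", lora.num_trainable(), "vs full matrix:", dim * dim)
```

In this toy setting the rank-2 adapter trains 32 values instead of the 64 in the full matrix. The saving grows with layer size, which is what keeps the trainable weight count small in practice.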

Quick Start & Requirements

  • Install: either run conda env create -f environment.yaml followed by conda activate img2img-turbo, or run pip install -r requirements.txt inside a virtual environment.
  • Prerequisites: Python, plus Conda if you use the Conda install path. A GPU is required for inference.
  • Demo: Gradio demos are provided in gradio_sketch2image.py and gradio_canny2image.py.
  • Docs: Paper
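Because inference requires a GPU, a quick pre-flight check after installation can save a failed demo launch. The sketch below assumes the environment is PyTorch-based, which the summary above does not state explicitly. The helper name cuda_gpu_available is hypothetical and not part of the repository.

```python
import importlib.util

def cuda_gpu_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    # Assumption: the project runs on PyTorch. If torch is absent, report False
    # instead of raising, so the check also works before installation.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()

gpu_ok = cuda_gpu_available()
print("CUDA GPU available:", gpu_ok)
```

If this prints False inside the activated environment, either the installation is incomplete or no CUDA device is visible to PyTorch.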

Highlighted Details

  • Achieves 0.29s inference on A6000 and 0.11s on A100 for 512x512 images.
  • The authors report that CycleGAN-Turbo outperforms existing GAN-based and diffusion-based methods for unpaired translation.
  • The authors report that pix2pix-turbo is on par with ControlNet for paired tasks while needing only one inference step.
  • Supports diverse output generation by varying input noise maps and controlling style via text prompts.
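To put the reported latencies in terms of "near real-time", they can be converted to throughput. This is a back-of-the-envelope sketch that assumes images are processed one at a time and that the quoted figures cover the whole inference call. Model loading and pre- or post-processing are ignored.

```python
# Reported per-image latency in seconds for 512x512 inputs (from the list above).
latency_s = {"A6000": 0.29, "A100": 0.11}

# Throughput is the reciprocal of latency when images are processed sequentially.
throughput = {gpu: 1.0 / seconds for gpu, seconds in latency_s.items()}

for gpu, images_per_s in throughput.items():
    print(f"{gpu}: {latency_s[gpu]:.2f} s per image, about {images_per_s:.1f} images per second")
```

This works out to roughly 3.4 images per second on an A6000 and 9.1 on an A100, which is interactive but still below typical video frame rates.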

Maintenance & Community

The project is associated with authors from CMU and Adobe. Links to community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The project builds on Stable Diffusion-Turbo (SD-Turbo) as its base model, so use of the pretrained weights is subject to that model's license. The README does not state a separate license for the adapted code. SD-Turbo is distributed under Stability AI's own license terms rather than a standard permissive open-source license, so check the model's license before any commercial use.

Limitations & Caveats

The README focuses on successful applications and does not detail known limitations, unsupported tasks, or potential issues with the adversarial training process. The training section links to external steps rather than providing them directly.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 5
  • Issues (30d): 3
  • Star history: 153 stars in the last 90 days
