latentblending by lunarring

Video tool for smooth Stable Diffusion prompt transitions

created 2 years ago
363 stars

Top 78.5% on sourcepulse

Project Summary

This repository provides a Python library for generating high-quality, smooth video transitions between text prompts using Stable Diffusion XL. It's designed for researchers and artists looking to create seamless visual narratives or dynamic animations with minimal computational cost and high user control.

How It Works

Latent blending manipulates intermediate latent representations within the diffusion process. It constructs a "diffusion tree" between the two prompts, adding branches dynamically wherever the perceptual distance (LPIPS) between neighboring frames is largest, so that the transition stays smooth. Latents are cross-fed between branches to preserve structure, and a dedicated technique suppresses motion artifacts, so the viewer perceives a single, continuously changing image. With SDXL Turbo, a transition can be computed faster than it plays back.
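The branch-insertion idea can be sketched with toy stand-ins. The sketch below is not the library's code: `render` replaces an actual diffusion render with a steep sigmoid, `distance` replaces LPIPS with an absolute difference, and splitting the widest gap at its midpoint is an assumption made for illustration. All function names are hypothetical.

```python
import math


def render(t):
    """Toy stand-in for a diffusion render at blend fraction t in [0, 1]."""
    # Steep sigmoid: most of the visible change happens around t = 0.5.
    return 1.0 / (1.0 + math.exp(-12.0 * (t - 0.5)))


def distance(a, b):
    """Toy stand-in for a perceptual distance such as LPIPS."""
    return abs(a - b)


def build_transition(num_frames):
    """Start from the two keyframes and repeatedly split the widest gap."""
    fracs = [0.0, 1.0]
    frames = [render(0.0), render(1.0)]
    while len(frames) < num_frames:
        # Distance between each pair of neighboring frames.
        gaps = [distance(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
        # Index of the pair that differs most (assumed midpoint split).
        i = max(range(len(gaps)), key=gaps.__getitem__)
        t_new = 0.5 * (fracs[i] + fracs[i + 1])
        fracs.insert(i + 1, t_new)
        frames.insert(i + 1, render(t_new))
    return fracs, frames


def max_gap(frames):
    """Largest distance between any two neighboring frames."""
    return max(distance(a, b) for a, b in zip(frames, frames[1:]))


fracs, frames = build_transition(9)
uniform = [render(i / 8) for i in range(9)]
print(f"adaptive max gap: {max_gap(frames):.3f}")
print(f"uniform max gap:  {max_gap(uniform):.3f}")
```

With the same number of frames, the adaptive placement concentrates frames where the toy image changes fastest, so its largest frame-to-frame jump is smaller than with evenly spaced blend fractions.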

Quick Start & Requirements

  • Install: pip install git+https://github.com/lunarring/latentblending
  • Requirements: Python, PyTorch, CUDA (GPU required for reasonable performance).
  • Optional Speedup: stable-fast compilation (do_compile=True).
  • UI: python latentblending/gradio_ui.py
  • Examples: single_trans.py, multi_trans.py

Highlighted Details

  • Supports SDXL and SDXL Turbo for fast transitions.
  • Dynamic branch injection using LPIPS for optimal smoothness.
  • Customizable parameters: guidance scale, diffusion steps, cross-feeding power, range, and decay.
  • Gradio UI for iterative keyframe generation and movie stitching.
  • Time-based computation allows specifying compute budget instead of frame count.
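To make the cross-feeding parameters concrete, here is a minimal sketch of how power, range, and decay could shape a per-step mixing weight. It is an illustration under assumptions, not the library's implementation: the function names, the linear decay, and the element-wise blend are all hypothetical, and latents are shown as plain lists rather than tensors.

```python
def crossfeed_schedule(num_steps, power, range_, decay):
    """Per-step weight given to the parent latent (hypothetical schedule).

    power:  weight at the first diffusion step
    range_: fraction of steps during which cross-feeding is active
    decay:  factor by which the weight has shrunk at the last active step
    """
    n_active = int(round(num_steps * range_))
    weights = []
    for step in range(num_steps):
        if step < n_active:
            # Assumed linear ramp from `power` down to `power * decay`.
            frac = step / (n_active - 1) if n_active > 1 else 0.0
            weights.append(power * (1.0 - frac * (1.0 - decay)))
        else:
            # Past the active range the branch evolves on its own.
            weights.append(0.0)
    return weights


def mix(own, parent, weight):
    """Blend a branch's latent with its parent's latent, element-wise."""
    return [(1.0 - weight) * o + weight * p for o, p in zip(own, parent)]


weights = crossfeed_schedule(num_steps=10, power=0.8, range_=0.5, decay=0.5)
latent = mix(own=[0.0, 2.0], parent=[1.0, 0.0], weight=weights[0])
print(weights)
print(latent)
```

In this sketch a higher power or a longer range ties a branch more closely to its parent, which favors structural consistency over variation.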

Maintenance & Community

  • Primary contact: Johannes Stelzer (@j_stelzer on Twitter).
  • Future plans include macOS support, Huggingface Spaces integration, and ControlNet/IP-Adapter.

Licensing & Compatibility

  • License: Not explicitly stated in the README. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

  • macOS support is listed as "coming soon."
  • Inpaint support has been dropped.
  • The exact license is not specified, which may impact commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify) and Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers).

taesd by madebyollin

0.5%
758
Tiny AutoEncoder for Stable Diffusion latents
created 2 years ago
updated 3 months ago