latentblending by lunarring

Video tool for smooth Stable Diffusion prompt transitions

Created 3 years ago

366 stars

Top 77.0% on SourcePulse

1 Expert Loves This Project: transitive-bullshit (Founder of Agentic)

Project Summary

This repository provides a Python library for generating smooth, high-quality video transitions between text prompts with Stable Diffusion XL. It targets researchers and artists who want seamless visual narratives or dynamic animations at low computational cost and with fine-grained control over the result.

How It Works

Latent blending manipulates the intermediate latent representations of the diffusion process. It builds a "diffusion tree" in which new branches are added dynamically, based on perceptual similarity (LPIPS), to keep the transition smooth. Latents are cross-fed between branches to preserve structure, and a novel technique suppresses motion artifacts, tricking the visual system into perceiving a single continuous image. With SDXL Turbo, a transition can be generated faster than it plays back.
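The idea of adding branches where neighbouring frames differ most can be illustrated with a small, self-contained sketch. This is not the library's code: build_transition, toy_render and toy_distance are hypothetical names, the "frames" are one-element lists rather than images, and an absolute difference stands in for LPIPS. It only shows one plausible greedy refinement, assuming that a new branch is placed halfway between the two least similar neighbours.

```python
import math
from typing import Callable, List

Frame = List[float]


def build_transition(
    n_frames: int,
    render: Callable[[float], Frame],
    distance: Callable[[Frame, Frame], float],
) -> List[float]:
    """Return sorted mixing fractions in [0, 1], refined where neighbours differ most."""
    if n_frames < 2:
        raise ValueError("need at least the two endpoint frames")
    fractions = [0.0, 1.0]
    frames = [render(0.0), render(1.0)]
    while len(fractions) < n_frames:
        # Distance between every pair of neighbouring frames.
        gaps = [distance(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
        # Pick the least similar pair (first one on ties) and split it in the middle.
        i = max(range(len(gaps)), key=gaps.__getitem__)
        mid = 0.5 * (fractions[i] + fractions[i + 1])
        fractions.insert(i + 1, mid)
        frames.insert(i + 1, render(mid))
    return fractions


def toy_render(t: float) -> Frame:
    # Hypothetical stand-in for a diffusion run: content changes abruptly near t = 0.5.
    return [math.tanh(8.0 * (t - 0.5))]


def toy_distance(a: Frame, b: Frame) -> float:
    # Hypothetical stand-in for LPIPS: mean absolute difference.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    print(build_transition(9, toy_render, toy_distance))
```

In this toy setting the fractions cluster around the region where the content changes fastest, which is the behaviour the similarity-driven branching is meant to achieve.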

Quick Start & Requirements

Install: pip install git+https://github.com/lunarring/latentblending
Requirements: Python, PyTorch, CUDA (GPU required for reasonable performance).
Optional Speedup: stable-fast compilation (do_compile=True).
UI: python latentblending/gradio_ui.py
Examples: single_trans.py, multi_trans.py

Highlighted Details

Supports SDXL and SDXL Turbo for fast transitions.
Dynamic branch injection using LPIPS for optimal smoothness.
Customizable parameters: guidance scale, diffusion steps, cross-feeding power, range, and decay.
Gradio UI for iterative keyframe generation and movie stitching.
Time-based computation allows specifying compute budget instead of frame count.
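To make the cross-feeding parameters listed above more concrete, here is a minimal sketch of how a power, range and decay setting could translate into per-step mixing weights. The semantics are assumptions made for illustration, not the library's definitions: power is taken as the initial weight of the parent latent, range_frac as the fraction of diffusion steps during which cross-feeding is active, and decay as the factor the weight has been scaled by at the end of that range (1.0 means no decay). The function names crossfeed_weights and mix are hypothetical.

```python
from typing import List


def crossfeed_weights(
    num_steps: int, power: float, range_frac: float, decay: float
) -> List[float]:
    """Per-step weight of the parent latent under the assumed semantics."""
    if num_steps < 1:
        raise ValueError("num_steps must be positive")
    for name, value in (("power", power), ("range_frac", range_frac), ("decay", decay)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    n_active = int(round(range_frac * num_steps))
    weights = []
    for step in range(num_steps):
        if step >= n_active:
            # Outside the active range the branch evolves on its own.
            weights.append(0.0)
        elif n_active == 1:
            weights.append(power)
        else:
            # Linear ramp from power down to power * decay.
            progress = step / (n_active - 1)
            weights.append(power * (1.0 + (decay - 1.0) * progress))
    return weights


def mix(child: List[float], parent: List[float], weight: float) -> List[float]:
    """Blend a parent latent into a child latent with the given weight."""
    return [(1.0 - weight) * c + weight * p for c, p in zip(child, parent)]


if __name__ == "__main__":
    for step, w in enumerate(crossfeed_weights(10, power=0.8, range_frac=0.5, decay=0.5)):
        print(step, round(w, 3))
```

Under these assumptions, a higher power or a longer range ties a branch more closely to its parent and so preserves more structure, while a stronger decay releases it earlier.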

Maintenance & Community

Primary contact: Johannes Stelzer (@j_stelzer on Twitter).
Future plans include macOS support, Hugging Face Spaces integration, and ControlNet/IP-Adapter support.

Licensing & Compatibility

License: Not explicitly stated in the README. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

macOS support is listed as "coming soon."
Inpaint support has been dropped.
The exact license is not specified, which may impact commercial adoption.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 1 star in the last 30 days

Explore Similar Projects

stable-diffusion-webui-prompt-travel by Kahsolt

SD WebUI extension for latent-space prompt interpolation to create pseudo-animations

Created 3 years ago

Updated 1 year ago

Starred by Omar Sanseviero (DevRel at Google DeepMind) and Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral).

mixture-of-diffusers by albarji

Image generation method for scene composition using multiple diffusion processes

Created 3 years ago

Updated 2 years ago

kandinsky-5 by kandinskylab

Advanced diffusion models for versatile video and image generation

Created 5 months ago

Updated 1 week ago

Disco-Stable-Diffusion-Win-GUI by zhaoyun0071

Windows GUI for Stable Diffusion

Created 3 years ago

Updated 2 years ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Omar Sanseviero (DevRel at Google DeepMind).

SEINE by Vchitect

Video diffusion model for generative transition and prediction (ICLR 2024 paper)

Created 2 years ago

Updated 1 year ago

Real-Time-Latent-Consistency-Model by radames

App for real-time diffusion model pipelines using Diffusers

Created 2 years ago

Updated 3 months ago

Starred by Luis Capelo (Cofounder of Lightning AI).

animatediff-cli by neggles

CLI tool for AnimateDiff stable diffusion generation

Created 2 years ago

Updated 2 weeks ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Zhiqiang Xie (Coauthor of SGLang), and 1 more.

Sana by NVlabs

Image synthesis research paper using a linear diffusion transformer

Created 1 year ago

Updated 3 weeks ago

Starred by Chenlin Meng (Cofounder of Pika), Yoland Yan (Cofounder of Comfy Org), and 2 more.

Tune-A-Video by showlab

Text-to-video generation via diffusion model fine-tuning

Created 3 years ago

Updated 2 years ago

Starred by Chenlin Meng (Cofounder of Pika), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 4 more.

stable-diffusion-videos by nateraw

Video generator for Stable Diffusion latent space exploration

Created 3 years ago

Updated 3 weeks ago

ComfyUI-WanVideoWrapper by kijai

ComfyUI nodes for advanced video generation

Created 10 months ago

Updated 3 days ago

Starred by Gabriel Almeida (Cofounder of Langflow), Alex Yu (Research Scientist at OpenAI; Cofounder of Luma AI), and 2 more.

LTX-Video by Lightricks

DiT-based video generation model for high-quality, real-time video creation

Created 1 year ago

Updated 5 days ago
