SD-Latent-Interposer  by city96

Neural network for Stable Diffusion latent space interoperability

created 2 years ago
296 stars

Top 90.6% on sourcepulse

Project Summary

This project provides a ComfyUI custom node, backed by a small neural network, that converts latents directly between the latent spaces of different Stable Diffusion models without a VAE decode and re-encode step. It targets Stable Diffusion users who want to use latents from newer models (SDXL, SD3, Flux.1, Stable Cascade) with older architectures (SDv1.x) or, where a conversion is supported, the reverse. The result is a more streamlined workflow that may preserve finer details.

How It Works

The interposer uses a small neural network trained to map latents from one Stable Diffusion model's latent space to another's. This avoids the lossy VAE decode/encode cycle and aims to preserve image fidelity and composition. Training minimizes several loss functions together: direct latent reconstruction losses (p_loss, b_loss) and round-trip consistency losses (r_loss, h_loss), so that converted latents stay accurate in both directions.
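The idea of combining a direct reconstruction term with a round-trip term can be sketched in a few lines. This is a toy illustration under stated assumptions, not the project's training code: a per-pixel linear channel map stands in for the real network, and the names apply_channel_map, l1_loss, combined_loss, and round_trip_weight are hypothetical. The exact definitions of p_loss, b_loss, r_loss, and h_loss are not given here, so the sketch uses a plain L1 distance for both terms.

```python
# Toy sketch only: a per-pixel linear channel map stands in for the real
# interposer network. All names and the L1 choice are assumptions.

def apply_channel_map(latent, weights):
    """Map each pixel's channel vector through a matrix (one row per output channel)."""
    return [
        [sum(w * c for w, c in zip(row, pixel)) for row in weights]
        for pixel in latent
    ]


def l1_loss(a, b):
    """Mean absolute difference between two latents of the same shape."""
    total = sum(abs(x - y) for pa, pb in zip(a, b) for x, y in zip(pa, pb))
    count = sum(len(pa) for pa in a)
    return total / count


def combined_loss(src, dst, forward_w, backward_w, round_trip_weight=0.5):
    """Direct reconstruction loss plus a weighted round-trip consistency loss."""
    predicted = apply_channel_map(src, forward_w)         # source space -> target space
    direct = l1_loss(predicted, dst)                      # how close to the true target latent
    recovered = apply_channel_map(predicted, backward_w)  # target space -> back to source space
    round_trip = l1_loss(recovered, src)                  # how well the source latent survives
    return direct + round_trip_weight * round_trip


# Two pixels with two channels each; the target latent is the source doubled.
src = [[1.0, 2.0], [3.0, 4.0]]
dst = [[2.0, 4.0], [6.0, 8.0]]
double = [[2.0, 0.0], [0.0, 2.0]]
halve = [[0.5, 0.0], [0.0, 0.5]]
print(combined_loss(src, dst, double, halve))  # a perfect forward/backward pair gives 0.0
```

In a real setup the two maps would be trained networks and the loss would be minimized by gradient descent; the sketch only shows how the two kinds of terms fit together.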

Quick Start & Requirements

  • Install by cloning the repo into ComfyUI/custom_nodes/SD-Latent-Interposer, or by placing comfy_latent_interposer.py directly in ComfyUI/custom_nodes/.
  • Requires huggingface-hub (pip install huggingface-hub).
  • Models are downloaded from Hugging Face by default; local models can be placed in custom_nodes/SD-Latent-Interposer/models.
  • Official documentation and examples are available within the repository.
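
The model lookup described above (a local models folder, with Hugging Face as the default source) might look roughly like the following. This is a sketch under assumptions, not the node's actual code: resolve_model_path and its parameters are hypothetical, checking the local folder first is an assumed ordering, and the repository ID and filename must be supplied by the caller because they are not stated here. The only external call used is hf_hub_download from huggingface-hub, imported lazily so the local path works without a download.

```python
from pathlib import Path


def resolve_model_path(filename, local_dir, repo_id, download=None):
    """Return a path to the model file, preferring a local copy (assumed order).

    `download` is an optional callable (repo_id, filename) -> path, mainly
    useful for testing; by default huggingface_hub.hf_hub_download is used.
    """
    local_path = Path(local_dir) / filename
    if local_path.is_file():
        # A manually placed model takes priority in this sketch.
        return str(local_path)
    if download is None:
        # Lazy import: only needed when the file is not available locally.
        from huggingface_hub import hf_hub_download
        return hf_hub_download(repo_id=repo_id, filename=filename)
    return download(repo_id, filename)
```

A caller would pass the models folder (for example custom_nodes/SD-Latent-Interposer/models) as local_dir, together with whichever repository ID and filename the README lists.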

Highlighted Details

  • Supports interoperability between SDv1.x, SDXL, SD3, Flux.1, and Stable Cascade (Stage A/B).
  • Offers pre-trained models for various conversion directions (e.g., v1 to XL, XL to v1, v3 to v1, etc.).
  • Training code is provided, allowing for custom model training with a specified dataset format.
  • Version 4.0 models are available, with previous versions (v3.1, v1.1) also documented.

Maintenance & Community

  • The project is maintained by city96.
  • Model weights are hosted on Hugging Face.
  • No specific community channels (Discord/Slack) or roadmap are explicitly mentioned in the README.

Licensing & Compatibility

  • The repository is licensed under the Apache 2.0 license.
  • This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

  • Some conversion directions (e.g., to Flux.1, to Stable Cascade) are marked as "No" in the README's compatibility table, meaning no model is provided for them.
  • The README notes that color, hue, and saturation shifts can still occur with certain conversions, and that artifacts may persist.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 10 stars in the last 90 days
