Visual-Style-Prompting by naver-ai

Text-to-image research paper for stylized generation via visual style prompting

created 1 year ago
458 stars

Top 67.0% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation for "Visual Style Prompting with Swapping Self-Attention," a method for text-to-image generation that allows users to control image style without fine-tuning diffusion models. It targets researchers and developers in AI image generation seeking consistent style transfer and reduced content leakage.

How It Works

The method is training-free: it manipulates the self-attention layers during the diffusion model's denoising process. Specifically, in the later self-attention layers it keeps the query from the original features and replaces the key and value with those computed from the reference style features. This injects the desired visual style while preserving the content specified by the text prompt, and it avoids costly model fine-tuning.
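The key/value swap described above can be illustrated with a minimal NumPy sketch. This is not the repository's code: the function names (`self_attention`, `swapped_self_attention`), the single-head layout, and the toy tensor shapes are assumptions chosen for illustration, and a real diffusion model would apply the swap only inside selected late layers of its U-Net.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(q, k, v):
    # Standard single-head scaled dot-product attention.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ v


def swapped_self_attention(q_orig, k_ref, v_ref):
    # Hypothetical helper: the query comes from the image being generated,
    # while key and value come from the reference (style) features.
    return self_attention(q_orig, k_ref, v_ref)


# Toy features: 16 tokens with 8 channels each (illustrative sizes only).
rng = np.random.default_rng(0)
tokens, dim = 16, 8
q_orig, k_orig, v_orig = (rng.standard_normal((tokens, dim)) for _ in range(3))
k_ref, v_ref = (rng.standard_normal((tokens, dim)) for _ in range(2))

plain = self_attention(q_orig, k_orig, v_orig)        # ordinary self-attention
styled = swapped_self_attention(q_orig, k_ref, v_ref)  # style-injected output
```

Because every output row is a softmax-weighted average of the reference value rows, the result is assembled entirely from reference features, while the attention weights are still driven by the original query.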

Quick Start & Requirements

  • Install via pip: pip install --upgrade diffusers accelerate transformers einops kornia gradio triton xformers==0.0.16
  • Requires PyTorch 1.13.1.
  • Official Hugging Face demos are available in two variants: default and with ControlNet.
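Since the requirements above pin exact versions, a quick environment check can save a failed run. The following is a minimal sketch using only the Python standard library; the helper name `check_pins` and the choice to ignore local build suffixes such as "+cu117" are assumptions, not part of the repository.

```python
from importlib import metadata

# Versions taken from the requirements listed above.
PINNED = {"torch": "1.13.1", "xformers": "0.0.16"}


def check_pins(pins=PINNED):
    """Return {package: (installed_version_or_None, wanted_version, matches)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Assumption: a local build suffix (e.g. "1.13.1+cu117") still counts as a match.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, ok)
    return report


for name, (installed, wanted, ok) in check_pins().items():
    status = "ok" if ok else "mismatch"
    print(f"{name}: installed={installed}, wanted={wanted} -> {status}")
```

A package that is not installed is reported with `installed=None` rather than raising, so the check can run before anything has been installed.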

Highlighted Details

  • Enables training-free visual style prompting.
  • According to the authors, reflects the reference style and follows the text prompt better than existing methods.
  • Supports integration with ControlNet for edge and depth conditioning.
  • Includes a script for real image style transfer with optional color calibration.

Maintenance & Community

The project is developed by NAVER AI Lab and Yonsei University. Links to community resources are not explicitly provided in the README.

Licensing & Compatibility

Licensed under the Apache License, Version 2.0. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The implementation is pinned to specific versions of PyTorch (1.13.1) and xformers (0.0.16). The README lists "color calibration to use a real image as reference" as a to-do, but the vsp_real_script.py script appears to implement this functionality already.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 11 stars in the last 90 days
