stable-diffusion-aesthetic-gradients by vicgalle

Personalization technique for Stable Diffusion, accompanying a research paper

created 2 years ago
731 stars

Top 48.3% on sourcepulse

Project Summary

This repository provides a method for personalizing Stable Diffusion text-to-image generation by guiding the process towards custom aesthetics defined by user-provided image sets. It targets users seeking to influence image style without extensive prompt engineering or model retraining, offering a way to imbue generated images with specific visual characteristics.

How It Works

The core innovation is "aesthetic gradients," a technique that steers generation toward an "aesthetic embedding" computed from a collection of images representing the desired aesthetic. During sampling, the prompt's conditioning is optimized for a few gradient steps to increase its similarity to this embedding. By adjusting parameters such as aesthetic_steps and aesthetic_lr, users control how strongly the generation process aligns with the learned aesthetic, effectively steering the output toward a specific visual style.
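A minimal PyTorch sketch of the idea described above. The function names, tensor shapes, and default values here are hypothetical illustrations, not the repository's actual API: an aesthetic embedding is the averaged (unit-normalized) feature vector of the reference images, and personalization takes a few gradient steps that pull the prompt conditioning toward it.

```python
import torch
import torch.nn.functional as F

def aesthetic_embedding(image_feats: torch.Tensor) -> torch.Tensor:
    """Average a set of image feature vectors (e.g. CLIP features),
    shape (n_images, d), into one unit-norm aesthetic embedding."""
    e = image_feats.mean(dim=0)
    return e / e.norm()

def personalize_conditioning(c: torch.Tensor, e: torch.Tensor,
                             aesthetic_lr: float = 1e-4,
                             aesthetic_steps: int = 5) -> torch.Tensor:
    """Take `aesthetic_steps` gradient steps that increase the cosine
    similarity between the prompt conditioning `c` and the aesthetic
    embedding `e`, mirroring the aesthetic_lr / aesthetic_steps knobs."""
    c = c.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([c], lr=aesthetic_lr)
    for _ in range(aesthetic_steps):
        opt.zero_grad()
        # Negative similarity as the loss, so minimizing it pulls
        # the conditioning toward the aesthetic embedding.
        loss = -F.cosine_similarity(c, e.detach(), dim=-1)
        loss.backward()
        opt.step()
    return c.detach()
```

Larger aesthetic_steps or aesthetic_lr values push the output harder toward the learned aesthetic, at the cost of fidelity to the literal prompt.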

Quick Start & Requirements

  • Install via pip install -e . (the repository is itself a fork of the original Stable Diffusion codebase).
  • Prerequisites are identical to the original Stable Diffusion repository.
  • Usage involves scripts/txt2img.py with additional arguments: --aesthetic_steps, --aesthetic_lr, and --aesthetic_embedding.
  • Pre-computed aesthetic embeddings are available in the aesthetic_embeddings directory.
  • A script scripts/gen_aesthetic_embedding.py is provided for generating custom embeddings.
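Putting the steps above together, a hypothetical invocation might look like the following. The prompt, flag values, and embedding filename are illustrative assumptions; consult the repository README for the exact paths and defaults.

```shell
# Steer generation toward a pre-computed aesthetic embedding
# (the .pt filename under aesthetic_embeddings/ is an assumption).
python scripts/txt2img.py \
  --prompt "a castle on a hill" \
  --aesthetic_steps 10 \
  --aesthetic_lr 0.0001 \
  --aesthetic_embedding aesthetic_embeddings/fantasy.pt

# The script for building a custom embedding from your own images;
# its arguments are not documented here, so check its help output.
python scripts/gen_aesthetic_embedding.py --help
```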

Highlighted Details

  • Enables style personalization without retraining or high computational resources.
  • Demonstrates effectiveness with various Stable Diffusion models and datasets.
  • Allows mixing prompt-based styles with aesthetic embedding styles.
  • Includes pre-computed embeddings for styles like "fantasy," "flower_plant," and artist-specific aesthetics.

Maintenance & Community

The project accompanies the paper "Personalizing Text-to-Image Generation via Aesthetic Gradients" by Victor Gallego. Further resources and examples are linked in the README, including blog posts and discussions on platforms such as Bilibili and Zhihu. A pull request integrating the technique into the AUTOMATIC1111 Stable Diffusion Web UI is also mentioned.

Licensing & Compatibility

The repository is a fork of the original Stable Diffusion codebase and presumably inherits its licensing. The README does not explicitly state a license for this fork; note that Stable Diffusion's model weights are distributed under the CreativeML OpenRAIL-M license, which permits commercial use subject to use-based restrictions rather than being fully permissive.

Limitations & Caveats

The effectiveness of personalization is dependent on the quality and representativeness of the user-provided images for generating custom aesthetic embeddings. The README notes that simply adding an artist's name to the prompt may not yield the same results as using a dedicated aesthetic embedding.

Health Check

Last commit: 2 years ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History: 3 stars in the last 90 days
