text2image-gui by n00mkrad

GUI for Stable Diffusion text-to-image generation

Created 3 years ago
962 stars

Top 38.3% on SourcePulse

Project Summary

This GUI provides a user-friendly interface for text-to-image generation, primarily targeting Stable Diffusion. It offers advanced prompt engineering features, model management, and image post-processing, catering to both novice users and power users seeking fine-grained control over AI image creation.

How It Works

The GUI leverages a customized fork of InvokeAI's Stable Diffusion codebase, allowing for modularity and feature expansion. It supports multiple Stable Diffusion implementations (InvokeAI, ONNX) and other models like InstructPix2Pix. Key features include advanced prompt syntax for emphasis and wildcards, LoRA and Textual Inversion embedding integration, and img2img/inpainting capabilities.
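
As a rough illustration of how emphasis markers can be turned into numeric weights, here is a minimal sketch. It assumes a convention used by some Stable Diffusion front ends, where each trailing "+" multiplies a term's weight by 1.1 and each trailing "-" multiplies it by 0.9. This summary does not state the actual factors or the tokenization rules, and the names term_weight and parse_prompt are hypothetical.

```python
from __future__ import annotations

import re

# Assumed factors (not stated in this summary): "+" boosts, "-" attenuates.
UP, DOWN = 1.1, 0.9

# Lazily capture the text, then an optional trailing run of "+" or "-".
_TERM = re.compile(r"^(?P<text>.*?)(?P<marks>\++|-+)?$", re.DOTALL)


def term_weight(term: str) -> tuple[str, float]:
    """Split a term such as 'castle++' into its text and a numeric weight."""
    match = _TERM.match(term.strip())
    text = match.group("text").strip()
    marks = match.group("marks") or ""
    if marks.startswith("+"):
        weight = UP ** len(marks)
    elif marks.startswith("-"):
        weight = DOWN ** len(marks)
    else:
        weight = 1.0
    return text, weight


def parse_prompt(prompt: str) -> list[tuple[str, float]]:
    """Treat commas as term separators (a simplification for this sketch)."""
    return [term_weight(t) for t in prompt.split(",") if t.strip()]
```

Calling parse_prompt("a castle++, fog-, well-lit hall") yields three (text, weight) pairs. Only trailing markers count, so the hyphen inside "well-lit" is left alone.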

Quick Start & Requirements

  • Install: Download and run the executable.
  • Prerequisites: Windows 10/11 64-bit.
    • Minimum: NVIDIA GPU (Maxwell+) with 4GB VRAM, or DirectML GPU with 8GB VRAM. 8GB RAM (pagefile enabled). 10GB disk space.
    • Recommended: NVIDIA GPU (Pascal+) with 8GB VRAM. 16GB RAM. 12GB disk space on SSD.
  • Docs: System Requirements, Main Guide
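
The NVIDIA requirements above can be expressed as a simple check. This is only a sketch: the SystemSpec type and classify function are hypothetical, the DirectML path is left out, and it relies on the fact that Maxwell GPUs start at CUDA compute capability 5.0 and Pascal GPUs at 6.0.

```python
from dataclasses import dataclass


@dataclass
class SystemSpec:
    """Hypothetical description of a Windows machine with an NVIDIA GPU."""
    vram_gb: float
    ram_gb: float
    free_disk_gb: float
    compute_capability: float  # Maxwell starts at 5.0, Pascal at 6.0
    on_ssd: bool = False


def classify(spec: SystemSpec) -> str:
    """Return 'recommended', 'minimum', or 'below minimum' for the listed tiers."""
    recommended = (
        spec.compute_capability >= 6.0
        and spec.vram_gb >= 8
        and spec.ram_gb >= 16
        and spec.free_disk_gb >= 12
        and spec.on_ssd
    )
    if recommended:
        return "recommended"
    minimum = (
        spec.compute_capability >= 5.0
        and spec.vram_gb >= 4
        and spec.ram_gb >= 8
        and spec.free_disk_gb >= 10
    )
    return "minimum" if minimum else "below minimum"
```

For example, classify(SystemSpec(vram_gb=4, ram_gb=8, free_disk_gb=10, compute_capability=5.2)) falls into the minimum tier.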

Highlighted Details

  • Supports advanced prompt syntax: multiple prompts, negative prompts, emphasis (+/-), and wildcards (inline or file-based).
  • Integrated model management for Stable Diffusion checkpoints, VAEs, Textual Inversions, and LoRAs.
  • Post-processing options include upscaling (RealESRGAN) and face restoration (GFPGAN, CodeFormer).
  • Developer tools for model merging, pruning, and format conversion.
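
To make the wildcard idea concrete, the sketch below expands both inline and list-based wildcards in a prompt. The syntax is hypothetical and chosen only for illustration: "{a|b|c}" for inline options and "__name__" for a named list, with a dictionary standing in for wildcard files. The GUI's actual wildcard syntax is not given in this summary and may differ.

```python
import random
import re

# Hypothetical syntax for this sketch only.
INLINE = re.compile(r"\{([^{}]+)\}")   # e.g. {red|green|blue}
NAMED = re.compile(r"__(\w+?)__")      # e.g. __animal__


def expand(prompt: str, lists: dict, rng: random.Random) -> str:
    """Replace every wildcard in the prompt with one randomly chosen option."""

    def pick_named(match: re.Match) -> str:
        options = lists.get(match.group(1))
        # Leave unknown names untouched so mistakes stay visible.
        return rng.choice(options) if options else match.group(0)

    def pick_inline(match: re.Match) -> str:
        return rng.choice(match.group(1).split("|")).strip()

    prompt = NAMED.sub(pick_named, prompt)
    return INLINE.sub(pick_inline, prompt)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a batch can be reproduced
    wildcard_lists = {"animal": ["cat", "fox", "owl"]}
    for _ in range(3):
        print(expand("a {red|blue} __animal__ in the snow", wildcard_lists, rng))
```

Passing an explicit random.Random instance keeps the choice of options reproducible, which is useful when a generated image needs to be recreated from its prompt template.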

Maintenance & Community

Licensing & Compatibility

  • The README does not state a specific license. The project appears to be permissively licensed, but verify the terms before any commercial use.

Limitations & Caveats

  • AMD GPUs are supported through the ONNX implementation, which the README describes as having "limited feature support".
  • According to the README, some options are hidden depending on the selected implementation, so the available features differ between implementations.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 3 stars in the last 30 days
