Auto1111SDK by Auto1111SDK

Python SDK for Stable Diffusion inference

created 1 year ago
410 stars

Top 72.3% on sourcepulse

Project Summary

This Python library provides a lightweight SDK for the core functionality of the Automatic 1111 Stable Diffusion Web UI. It targets developers and researchers who need to generate, upscale, and edit images with diffusion models from code, and it offers a modular alternative to driving the Web UI directly or using other libraries.

How It Works

The SDK wraps the Automatic 1111 Web UI's pipelines: Text-to-Image, Image-to-Image, Inpainting, and Outpainting. A single pipeline object supports all of these operations, which aims to reduce RAM consumption compared to solutions that instantiate a separate pipeline per operation. The library also downloads models directly from Civitai and supports upscaling models such as ESRGAN and Real-ESRGAN.
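The memory argument above can be illustrated with a minimal sketch. It does not use the SDK; `Weights` and `SharedPipeline` are hypothetical stand-ins, and the only point is that several operations reuse one loaded set of weights instead of each loading its own copy.

```python
class Weights:
    """Hypothetical stand-in for model weights; not an SDK class."""

    load_count = 0  # how many times weights have been "loaded" in this process

    def __init__(self, path: str):
        Weights.load_count += 1
        self.path = path


class SharedPipeline:
    """Hypothetical pipeline: one object, one set of weights, several operations."""

    def __init__(self, checkpoint_path: str):
        # The weights are loaded once, here, and reused by every method below.
        self.weights = Weights(checkpoint_path)

    def txt2img(self, prompt: str) -> str:
        return f"txt2img({prompt!r}) using {self.weights.path}"

    def img2img(self, prompt: str, init_image: str) -> str:
        return f"img2img({prompt!r}, {init_image!r}) using {self.weights.path}"


pipe = SharedPipeline("model.safetensors")
first = pipe.txt2img("a brown dog")
second = pipe.img2img("a brown dog, oil painting", "dog.png")
print(Weights.load_count)  # both operations ran against a single load
```

A design that builds one pipeline object per operation would load the weights once per object, which is the overhead the single-object approach is meant to avoid.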

Quick Start & Requirements

  • Install via pip: pip3 install auto1111sdk
  • For the latest version with ControlNet: pip3 install git+https://github.com/saketh12/Auto1111SDK.git
  • Requires Python. Conda environments are not yet supported.
  • A Colab demo is available for trying out operations.
  • Detailed documentation and comparisons with Hugging Face Diffusers are provided.
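After installing, a quick check like the sketch below can confirm that the package is visible to the interpreter. `is_installed` is a helper written for this example. The `generate_example` function shows a Text-to-Image call as recalled from the project's README; the class name `StableDiffusionPipeline`, the method `generate_txt2img`, and its parameters are assumptions here and should be verified against the documentation for your installed version.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found by the current interpreter."""
    return importlib.util.find_spec(package) is not None


def generate_example(checkpoint_path: str, prompt: str):
    """Text-to-Image call as recalled from the project's README (unverified)."""
    # Imported lazily so this file still loads when the SDK is not installed.
    from auto1111sdk import StableDiffusionPipeline  # assumed class name

    pipe = StableDiffusionPipeline(checkpoint_path)
    # Assumed method name and parameters; check the docs before relying on them.
    return pipe.generate_txt2img(prompt=prompt, height=512, width=512, steps=10)


if __name__ == "__main__":
    print("auto1111sdk installed:", is_installed("auto1111sdk"))
```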

Highlighted Details

  • Supports Text-to-Image, Image-to-Image, Inpainting, Outpainting, and Stable Diffusion Upscale pipelines.
  • Integrates ESRGAN and Real-ESRGAN upscalers and allows direct model downloads from Civitai.
  • Features advanced prompt syntax for attention weighting and Composable Diffusion for multiple prompts with weights.
  • Offers a workaround for the 77-token prompt limit found in Hugging Face Diffusers.
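The prompt features in this list can be sketched without the SDK. The example below makes two assumptions. First, attention weighting is written as `(text:weight)`, following the Web UI's prompt syntax; nesting and the bare `(text)` and `[text]` forms are not handled. Second, the 77-token limit is worked around by splitting a long prompt into chunks of 75 tokens, leaving room for a start and an end token in each chunk. Whitespace splitting stands in for the real tokenizer, and both function names are made up for this illustration.

```python
import re

# Matches a non-nested "(text:weight)" group, e.g. "(red:1.3)".
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) pairs; unmarked text gets weight 1.0."""
    parts = []
    pos = 0
    for match in _WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip()
        if plain:
            parts.append((plain, 1.0))
        parts.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip()
    if tail:
        parts.append((tail, 1.0))
    return parts


def chunk_tokens(tokens: list[str], chunk_size: int = 75) -> list[list[str]]:
    """Split tokens into consecutive chunks of at most `chunk_size` items."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


print(parse_weighted_prompt("a (red:1.3) car on a road"))
# → [('a', 1.0), ('red', 1.3), ('car on a road', 1.0)]

# Whitespace tokens are a stand-in for real tokenizer output.
long_prompt = " ".join(["word"] * 160)
print([len(chunk) for chunk in chunk_tokens(long_prompt.split())])
# → [75, 75, 10]
```

In a chunked scheme of this kind, each chunk would be encoded separately and the resulting embeddings concatenated, so no part of a long prompt is silently truncated.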

Maintenance & Community

The project welcomes community contributions, including bug reports and feature requests. Contributions can be made via GitHub issues and pull requests.

Licensing & Compatibility

The README does not explicitly state the library's license. The library relies heavily on and integrates with the Automatic 1111 Stable Diffusion Web UI, which is distributed under AGPL-3.0, a copyleft license rather than a permissive one. Suitability for commercial use or closed-source linking therefore depends on that license and on the licenses of the other dependencies.

Limitations & Caveats

ControlNet currently supports only fp32 precision; fp16 support is planned. The roadmap lists Hires Fix, Refiner, LoRAs, Face restoration, and Dreambooth training scripts as planned additions, which suggests these features are not yet fully implemented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

3 stars in the last 90 days
