ml-stable-diffusion  by apple

Core ML Stable Diffusion for Apple Silicon devices

created 2 years ago
17,511 stars

Top 2.6% on sourcepulse

Project Summary

This repository provides conversion tools and a Swift package for running Stable Diffusion image generation models on Apple Silicon with Core ML. It targets developers building macOS, iOS, and iPadOS applications who want on-device image generation with lower latency and a smaller memory footprint.

How It Works

The project uses coremltools to convert PyTorch Stable Diffusion models to the Core ML format. Conversion can apply weight quantization (down to 6 bits, and lower with Mixed-Bit Palettization) and alternative attention implementations (SPLIT_EINSUM, SPLIT_EINSUM_V2) tailored to Apple's Neural Engine and GPUs. The resulting .mlpackage or .mlmodelc files can then be integrated into Xcode projects through a Swift package for on-device inference.
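
To make the weight quantization step more concrete, here is a minimal, standard-library sketch of palettization: weights are clustered into 2**nbits shared values, and each weight is then stored as a small index into that palette. This is an illustration of the general idea only, not the coremltools implementation; the function names (nearest, palettize, compression_ratio) are hypothetical, and the 16-bit baseline is an assumption.

```python
import random


def nearest(palette, w):
    """Index of the palette entry closest to weight w."""
    return min(range(len(palette)), key=lambda j: abs(w - palette[j]))


def palettize(weights, nbits=6, iterations=5):
    """Cluster weights into 2**nbits shared values with 1-D k-means.

    Returns (palette, indices). Each weight is represented by an
    nbits-wide index into the palette instead of a full float.
    Requires nbits >= 1.
    """
    k = 2 ** nbits
    lo, hi = min(weights), max(weights)
    # Start with centroids spread evenly across the weight range.
    palette = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: nearest centroid for every weight.
        indices = [nearest(palette, w) for w in weights]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [w for w, i in zip(weights, indices) if i == j]
            if members:
                palette[j] = sum(members) / len(members)
    # Final assignment against the updated palette.
    indices = [nearest(palette, w) for w in weights]
    return palette, indices


def compression_ratio(nbits, original_bits=16):
    """Approximate size reduction, ignoring the small palette itself."""
    return original_bits / nbits


if __name__ == "__main__":
    random.seed(0)
    weights = [random.gauss(0.0, 0.05) for _ in range(500)]
    palette, indices = palettize(weights, nbits=6)
    restored = [palette[i] for i in indices]
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{len(palette)} palette entries, worst-case error {worst:.5f}")
    print(f"approximate reduction vs 16-bit: {compression_ratio(6):.2f}x")
```

With 6 bits, every weight maps to one of 64 shared values. Mixed-Bit Palettization, as the name suggests, varies the bit width rather than using one setting everywhere; the sketch above uses a single fixed width.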

Quick Start & Requirements

  • Model Conversion: Requires macOS 13.1+, Python 3.8, and coremltools 7.0.
  • Project Build: Requires macOS 13.1+, Xcode 14.3+, and Swift 5.8.
  • Target Devices: macOS 13.1+, iPadOS/iOS 16.2+. The memory improvements require macOS 14.0+ or iPadOS/iOS 17.0+.
  • Hardware: Minimum M1 (Mac), M1 (iPad), A14 (iPhone).
  • Installation: Clone the repository, set up a Python environment (conda activate coreml_stable_diffusion, then pip install -e .), and log in with the Hugging Face CLI. Models are converted with python -m python_coreml_stable_diffusion.torch2coreml ....
  • Swift Integration: Add the StableDiffusion Swift package to Xcode projects.
  • Resources: Core ML Tools Docs, WWDC23 Session.
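
The version requirements above can be checked before attempting a conversion. The following is a small preflight sketch using only the Python standard library. It covers the macOS and Python versions only (not coremltools, Xcode, or Swift), and it treats the listed Python 3.8 as a minimum, which is an assumption; the helper names are hypothetical.

```python
import platform
import sys

# Minimums copied from the requirements list above.
MIN_MACOS = (13, 1)
MIN_PYTHON = (3, 8)  # assumption: treated as a minimum, not an exact pin


def parse_version(text):
    """Turn '13.1' or '14.0.1' into a tuple of ints such as (13, 1)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_minimum(found, minimum):
    """Tuples compare element by element, so (14, 0) >= (13, 1)."""
    return tuple(found) >= tuple(minimum)


def preflight():
    """Return a list of human-readable problems (empty when all is fine)."""
    problems = []
    mac_version = platform.mac_ver()[0]  # empty string when not on macOS
    if not mac_version:
        problems.append("not running on macOS")
    elif not meets_minimum(parse_version(mac_version), MIN_MACOS):
        problems.append(f"macOS {mac_version} is older than 13.1")
    if not meets_minimum(sys.version_info[:2], MIN_PYTHON):
        found = ".".join(str(n) for n in sys.version_info[:2])
        problems.append(f"Python {found} is older than 3.8")
    return problems


if __name__ == "__main__":
    for problem in preflight() or ["no problems found"]:
        print(problem)
```

Running the script prints one line per unmet requirement, which makes it easy to call from a setup script before starting a long conversion.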

Highlighted Details

  • Supports various Stable Diffusion versions (v1.4, v1.5, v2.1, XL, v3) with optimized Core ML models available on Hugging Face Hub.
  • Advanced compression techniques like Mixed-Bit Palettization (MBP) and Activation Quantization (W8A8) significantly reduce model size and improve inference speed on Apple Silicon.
  • Includes support for ControlNet and multilingual text encoders using Apple's NaturalLanguage framework.
  • Provides detailed performance benchmarks across different Apple devices, demonstrating latency and diffusion speed improvements with various optimization techniques.
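
As a rough illustration of what W8A8 means (8-bit weights and 8-bit activations), the sketch below quantizes both operands of a dot product to signed 8-bit integers, accumulates in integers, and rescales once at the end. It uses simple symmetric per-tensor scaling, which is an assumption chosen for brevity and not necessarily the scheme the repository or coremltools applies; the function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> int8 values plus a scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    # Clamp to the symmetric int8 range [-127, 127] after rounding.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(weights, activations):
    """Dot product with both operands quantized to 8 bits."""
    qw, sw = quantize_int8(weights)
    qa, sa = quantize_int8(activations)
    # Accumulate in integers, then apply both scales once.
    acc = sum(w * a for w, a in zip(qw, qa))
    return acc * sw * sa


if __name__ == "__main__":
    weights = [0.5, -0.25, 1.0]
    activations = [2.0, 4.0, -1.0]
    exact = sum(w * a for w, a in zip(weights, activations))
    print("float result:", exact)
    print("int8 result: ", int8_dot(weights, activations))
```

The integer result is close to the floating-point one but not identical, which is the trade-off such schemes make in exchange for smaller models and cheaper arithmetic.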

Maintenance & Community

  • Developed by Apple, with contributions from Hugging Face and the community.
  • Active development indicated by support for newer models like Stable Diffusion 3.
  • Hugging Face Diffusers App serves as a demo and reference implementation.

Licensing & Compatibility

  • The repository itself is licensed under the Apache License 2.0.
  • Core ML models are typically distributed under licenses from their original creators (e.g., Stability AI, CompVis), which should be reviewed for commercial use.

Limitations & Caveats

  • Model conversion can be memory-intensive, potentially requiring workarounds on systems with limited RAM (e.g., 8GB).
  • While optimizations aim for high fidelity, minor differences in generated images compared to PyTorch are possible due to floating-point precision and RNG variations.
  • Some advanced features like ControlNet for SDXL are not yet supported.
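
The floating-point caveat above can be illustrated with a short sketch. It assumes the on-device computation runs at reduced precision such as IEEE 754 half precision (float16), which the summary does not state explicitly, and shows how rounding after every operation makes a result drift from its double-precision counterpart. The helper names are hypothetical.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def running_sum(values, precision=None):
    """Sum values, optionally rounding inputs and partial sums via precision()."""
    total = 0.0
    for v in values:
        if precision is not None:
            v = precision(v)
        total = total + v
        if precision is not None:
            total = precision(total)
    return total


if __name__ == "__main__":
    steps = [0.1] * 50
    print("0.1 stored as float16:", to_float16(0.1))
    print("float64 sum:", running_sum(steps))
    print("float16 sum:", running_sum(steps, precision=to_float16))
```

A diffusion pipeline performs far more arithmetic than this loop, so small per-operation differences of this kind are one plausible source of the minor image differences mentioned above.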

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 294 stars in the last 90 days
