ml-stable-diffusion by apple

Core ML Stable Diffusion for Apple Silicon devices

Created 3 years ago

17,777 stars

Top 2.6% on SourcePulse

Project Summary

This repository provides conversion tools and a Swift package for running Stable Diffusion image generation models on Apple Silicon with Core ML. It targets developers building macOS, iOS, and iPadOS applications who want on-device image generation with optimized performance and a reduced memory footprint.

How It Works

The project uses coremltools to convert PyTorch Stable Diffusion models into the Core ML format. The conversion applies optimizations such as weight quantization (down to 6 bits, and lower with Mixed-Bit Palettization) and attention implementations (SPLIT_EINSUM, SPLIT_EINSUM_V2) tailored primarily for Apple's Neural Engine. The resulting .mlpackage files, or their compiled .mlmodelc counterparts, can then be integrated into Xcode projects via a Swift package for on-device inference.
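To make the weight-quantization step concrete, the sketch below shows the idea behind palettization in plain Python: weights are clustered into 2^n shared values (64 for 6-bit) and each weight is stored as an index into that lookup table. It is an illustration only, not the coremltools implementation; the helper names palettize and dequantize and the simple 1-D k-means loop are assumptions made for this example.

```python
import random


def _assign(weights, palette):
    """Map each weight to the index of its nearest palette entry."""
    return [min(range(len(palette)), key=lambda j: abs(w - palette[j]))
            for w in weights]


def palettize(weights, nbits=6, iterations=5):
    """Illustrative 1-D k-means palettization (hypothetical helper).

    Returns (palette, indices): 2**nbits shared values plus, for every
    weight, the index of the palette entry that replaces it.
    """
    k = 2 ** nbits
    lo, hi = min(weights), max(weights)
    # Start with centroids spread evenly over the weight range.
    palette = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        indices = _assign(weights, palette)
        sums, counts = [0.0] * k, [0] * k
        for w, idx in zip(weights, indices):
            sums[idx] += w
            counts[idx] += 1
        # Move each centroid to the mean of its members; keep empty ones.
        palette = [sums[j] / counts[j] if counts[j] else palette[j]
                   for j in range(k)]
    return palette, _assign(weights, palette)


def dequantize(palette, indices):
    """Rebuild approximate weights from the lookup table."""
    return [palette[i] for i in indices]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    palette, indices = palettize(weights, nbits=6)
    restored = dequantize(palette, indices)
    mse = sum((a - b) ** 2 for a, b in zip(weights, restored)) / len(weights)
    print(f"palette entries: {len(palette)}, mean squared error: {mse:.6f}")
```

Only the small palette and the n-bit indices need to be stored, which is where the size reduction comes from; the trade-off is the reconstruction error printed at the end.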

Quick Start & Requirements

Model Conversion: Requires macOS 13.1+, Python 3.8, and coremltools 7.0.
Project Build: Requires macOS 13.1+, Xcode 14.3+, and Swift 5.8.
Target Devices: macOS 13.1+ or iPadOS/iOS 16.2+. The memory improvements require macOS 14.0+ or iPadOS/iOS 17.0+.
Hardware: M1 or later (Mac, iPad), A14 or later (iPhone).
Installation: Clone the repository, create and activate a Python environment (for example, conda activate coreml_stable_diffusion), install the package with pip install -e ., and log in with the Hugging Face CLI. Run python -m python_coreml_stable_diffusion.torch2coreml ... to convert a model.
Swift Integration: Add the StableDiffusion Swift package to Xcode projects.
Resources: Core ML Tools Docs, WWDC23 Session.
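
The conversion step can be scripted. The sketch below only assembles an argument list for the torch2coreml module named above and prints it; it does not run the conversion. The flags shown are recalled from the repository's README and should be checked against the module's --help output before use. The helper name, the model identifier, and the output directory are placeholders chosen for this example.

```python
import shlex

# Attention variants; SPLIT_EINSUM and SPLIT_EINSUM_V2 are named in the
# summary above, ORIGINAL is assumed from the upstream README.
ATTENTION_CHOICES = {"ORIGINAL", "SPLIT_EINSUM", "SPLIT_EINSUM_V2"}


def build_conversion_command(model_version, output_dir,
                             attention="SPLIT_EINSUM", quantize_nbits=None):
    """Assemble (but do not run) a torch2coreml command line.

    Hypothetical helper: the flag names are assumptions to verify with
    `python -m python_coreml_stable_diffusion.torch2coreml --help`.
    """
    if attention not in ATTENTION_CHOICES:
        raise ValueError(f"unknown attention implementation: {attention}")
    cmd = [
        "python", "-m", "python_coreml_stable_diffusion.torch2coreml",
        "--convert-unet",
        "--convert-text-encoder",
        "--convert-vae-decoder",
        "--model-version", model_version,
        "--attention-implementation", attention,
        "-o", output_dir,
    ]
    if quantize_nbits is not None:
        # Optional weight quantization, e.g. 6 for 6-bit weights.
        cmd += ["--quantize-nbits", str(quantize_nbits)]
    return cmd


if __name__ == "__main__":
    # Placeholder model identifier and output directory.
    command = build_conversion_command(
        "stabilityai/stable-diffusion-2-1-base", "./coreml-models",
        quantize_nbits=6,
    )
    print(shlex.join(command))
```

Printing the command first makes it easy to review before passing the list to subprocess.run inside the activated environment.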

Highlighted Details

Supports various Stable Diffusion versions (v1.4, v1.5, v2.1, XL, v3) with optimized Core ML models available on Hugging Face Hub.
Compression techniques such as Mixed-Bit Palettization (MBP) and Activation Quantization (W8A8, meaning 8-bit weights and 8-bit activations) reduce model size and improve inference speed on Apple Silicon.
Supports ControlNet, as well as a multilingual text encoder built on Apple's NaturalLanguage framework.
Provides detailed performance benchmarks across different Apple devices, demonstrating latency and diffusion speed improvements with various optimization techniques.
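
A back-of-the-envelope calculation shows why lower bit widths shrink a model and how a mixed-bit scheme ends up with a fractional average bit width. The sketch below is plain arithmetic, not part of the repository: the function names are hypothetical, the layer sizes are made-up round numbers rather than measurements, and lookup-table overhead is ignored.

```python
def weights_size_mb(num_params, bits_per_weight):
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


def mixed_bit_average(layers):
    """Average bits per weight for a list of (num_params, bits) pairs."""
    total_params = sum(n for n, _ in layers)
    return sum(n * bits for n, bits in layers) / total_params


if __name__ == "__main__":
    # Hypothetical model: three layer groups with different bit widths.
    layers = [(400_000_000, 6), (300_000_000, 4), (100_000_000, 2)]
    total = sum(n for n, _ in layers)
    avg_bits = mixed_bit_average(layers)
    print(f"float16 size:   {weights_size_mb(total, 16):.0f} MB")
    print(f"uniform 6-bit:  {weights_size_mb(total, 6):.0f} MB")
    print(f"mixed-bit avg:  {avg_bits:.2f} bits per weight")
    print(f"mixed-bit size: {weights_size_mb(total, avg_bits):.0f} MB")
```

Assigning fewer bits to layers that tolerate it lowers the average below any single uniform setting, which is the intuition behind mixed-bit schemes.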

Maintenance & Community

Developed by Apple, with contributions from Hugging Face and the community.
Support for newer models such as Stable Diffusion 3 indicates ongoing development.
Hugging Face Diffusers App serves as a demo and reference implementation.

Licensing & Compatibility

The repository itself is licensed under the Apache License 2.0.
Converted Core ML models typically carry the license of the original model's creators (e.g., Stability AI, CompVis); review that license before commercial use.

Limitations & Caveats

Model conversion can be memory-intensive, potentially requiring workarounds on systems with limited RAM (e.g., 8GB).
While optimizations aim for high fidelity, minor differences in generated images compared to PyTorch are possible due to floating-point precision and RNG variations.
Some advanced features like ControlNet for SDXL are not yet supported.

Health Check

Last Commit

6 months ago

Responsiveness

1 day

Star History

71 stars in the last 30 days