qwen-image-mps by ivanfioravanti

AI-powered image generation and editing CLI tool

Created 7 months ago

261 stars

Top 97.4% on SourcePulse

Project Summary

This project provides a command-line interface and an experimental Gradio UI for generating and editing images with the Qwen-Image models. It targets engineers and power users who want to run these models locally on Apple Silicon (MPS), NVIDIA CUDA, or CPU. The tool selects the compute device automatically and offers reduced-step inference modes for faster results.

How It Works

The tool is built on Hugging Face's Diffusers library. Generation uses Qwen/Qwen-Image-2512; editing uses Qwen/Qwen-Image-Edit-2511 (via QwenImageEditPlusPipeline) with the linoyts/Qwen-Image-Edit-Rapid-AIO transformer. The compute device is selected automatically in priority order: Apple Silicon MPS (bfloat16), then NVIDIA CUDA (bfloat16), then CPU (float32). Generation can use Lightning LoRA models for 8-step ("Fast") or 4-step ("Ultra-Fast") inference, while editing relies on the Rapid-AIO transformer for 4-step processing.
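
The device priority described above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's actual code: the function names select_device and probe_and_select are hypothetical, and the dtype is returned as a plain string so the example runs even without PyTorch installed. The only PyTorch calls used are torch.backends.mps.is_available() and torch.cuda.is_available().

```python
from typing import Tuple


def select_device(mps_available: bool, cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype_name) using the MPS > CUDA > CPU priority.

    Hypothetical helper: it mirrors the priority order described in the
    summary, not the tool's real implementation.
    """
    if mps_available:
        return "mps", "bfloat16"
    if cuda_available:
        return "cuda", "bfloat16"
    return "cpu", "float32"


def probe_and_select() -> Tuple[str, str]:
    """Probe PyTorch for available backends; assume CPU if it is missing."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without PyTorch, report the CPU fallback.
        return select_device(False, False)
    return select_device(
        torch.backends.mps.is_available(),
        torch.cuda.is_available(),
    )
```

Keeping the decision in a pure function makes the priority order easy to check without any particular hardware; probe_and_select only supplies the two availability flags.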

Quick Start & Requirements

Primary Install: pip install qwen-image-mps
Alternative Install: Use uv run with the script URL or install from source (git clone ..., pip install -e .).
Prerequisites: Python environment. The first run automatically downloads large Hugging Face models. For optimal performance on Apple Silicon, ensure PyTorch includes MPS support and run using native Apple Silicon Python (not under Rosetta).
Links: GitHub Repository: https://github.com/ivanfioravanti/qwen-image-mps.git
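
To check the "native Apple Silicon Python" prerequisite before the first run, something like the following could be used. It is a minimal sketch using only the standard library: a Python interpreter running natively on Apple Silicon reports the machine type arm64, whereas one running under Rosetta reports x86_64. The function names are hypothetical and are not part of qwen-image-mps.

```python
import platform


def is_native_apple_silicon(system: str, machine: str) -> bool:
    """True when the interpreter is an arm64 build running on macOS."""
    return system == "Darwin" and machine == "arm64"


def describe_interpreter() -> str:
    """Summarize how the current Python interpreter is running."""
    system, machine = platform.system(), platform.machine()
    if is_native_apple_silicon(system, machine):
        return "native Apple Silicon Python"
    if system == "Darwin" and machine == "x86_64":
        # Either Rosetta on Apple Silicon hardware or a genuine Intel Mac;
        # this check alone cannot tell the two apart.
        return "x86_64 Python on macOS (Rosetta or Intel Mac)"
    return f"{system} / {machine}"


if __name__ == "__main__":
    print(describe_interpreter())
```

If PyTorch is installed, torch.backends.mps.is_available() can then confirm that the build includes MPS support.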

Highlighted Details

Device Agnostic: Automatic device selection prioritizes MPS, then CUDA, then CPU.
Accelerated Modes: "Fast" (8-step generation) and "Ultra-Fast" (4-step generation) modes leverage Lightning LoRA; editing uses Rapid-AIO transformer for 4-step processing.
Versatile Functionality: Supports text-to-image generation, image editing, multi-image operations, custom LoRA integration, and negative prompts.
Unique Features: Includes a "Batman mode" for LEGO Batman photobombs and a "Photo-to-Anime" mode for style transformation.
Gradio UI: An experimental graphical interface (qwen-image-mps-gradio) is available.
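
The step counts listed above can be summarized as a small lookup. This is only a sketch of the relationship between mode and step count: the names GENERATION_STEPS, EDIT_STEPS, and steps_for are hypothetical, they do not correspond to the tool's CLI flags, and the step count used when no accelerated mode is selected is not stated in the summary, so it is deliberately left out.

```python
# Step counts taken from the summary; the identifiers are illustrative only.
GENERATION_STEPS = {"fast": 8, "ultra-fast": 4}  # Lightning LoRA modes
EDIT_STEPS = 4  # Rapid-AIO transformer


def steps_for(task: str, mode: str = "") -> int:
    """Return the inference step count for a task and accelerated mode."""
    if task == "edit":
        return EDIT_STEPS
    if task == "generate":
        key = mode.strip().lower()
        if key not in GENERATION_STEPS:
            # The non-accelerated default is not given in the source.
            raise ValueError(f"unknown accelerated mode: {mode!r}")
        return GENERATION_STEPS[key]
    raise ValueError(f"unknown task: {task!r}")
```

For example, steps_for("generate", "Fast") yields 8 and steps_for("edit") yields 4.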

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or project roadmap were found in the provided README content.

Licensing & Compatibility

The license type is not explicitly stated in the provided README content. Therefore, compatibility for commercial use or closed-source linking cannot be determined from this information.

Limitations & Caveats

MPS device usage requires a compatible PyTorch build and native Apple Silicon Python execution. The Gradio UI is explicitly marked as experimental. License details are absent, preventing assessment of usage restrictions.

