qwen-image-mps by ivanfioravanti

AI-powered image generation and editing CLI tool

Created 5 months ago
256 stars

Top 98.7% on SourcePulse

Project Summary

This project provides a command-line interface and an experimental Gradio UI for generating and editing images with the Qwen-Image models. It targets engineers and power users who want to run image generation and editing on Apple Silicon (MPS), NVIDIA CUDA, or CPU, and it speeds up inference through automatic device selection and reduced-step fast modes.

How It Works

The tool is built on Hugging Face's Diffusers library. It uses Qwen/Qwen-Image-2512 for generation and Qwen/Qwen-Image-Edit-2511 (via QwenImageEditPlusPipeline) with the linoyts/Qwen-Image-Edit-Rapid-AIO transformer for editing. The compute device is selected automatically: Apple Silicon's MPS (bfloat16) first, then NVIDIA CUDA (bfloat16), with CPU (float32) as the fallback. Image generation can use Lightning LoRA models for accelerated 8-step ("Fast") or 4-step ("Ultra-Fast") inference, while editing relies on the Rapid-AIO transformer for 4-step processing.
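
The device priority described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names pick_device and detect_device are hypothetical, and the dtype is returned as a plain string so the sketch runs even where PyTorch is not installed.

```python
from typing import Tuple


def pick_device(mps_available: bool, cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype name) using the priority MPS > CUDA > CPU."""
    if mps_available:
        return "mps", "bfloat16"
    if cuda_available:
        return "cuda", "bfloat16"
    # Neither accelerator is available: fall back to CPU in full precision.
    return "cpu", "float32"


def detect_device() -> Tuple[str, str]:
    """Ask PyTorch which backends are available; assume CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return pick_device(False, False)
    return pick_device(
        torch.backends.mps.is_available(),
        torch.cuda.is_available(),
    )


if __name__ == "__main__":
    device, dtype_name = detect_device()
    print(f"device={device} dtype={dtype_name}")
```

Keeping the priority logic in a pure function, separate from the PyTorch query, makes the ordering easy to check without any particular hardware.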

Quick Start & Requirements

  • Primary Install: pip install qwen-image-mps
  • Alternative Install: Use uv run with the script URL or install from source (git clone ..., pip install -e .).
  • Prerequisites: Python environment. The first run automatically downloads large Hugging Face models. For optimal performance on Apple Silicon, ensure PyTorch includes MPS support and run using native Apple Silicon Python (not under Rosetta).
  • Links: GitHub Repository: https://github.com/ivanfioravanti/qwen-image-mps.git
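
One way to check the native-Python prerequisite before installing is sketched below. It assumes that a native Apple Silicon interpreter reports the machine type arm64 while an Intel build running under Rosetta reports x86_64; the helper names are hypothetical and not part of qwen-image-mps.

```python
import platform


def is_native_apple_silicon(system: str, machine: str) -> bool:
    """True when the interpreter reports macOS ("Darwin") on arm64."""
    return system == "Darwin" and machine == "arm64"


def check_current_interpreter() -> bool:
    """Apply the check to the Python interpreter running this script."""
    return is_native_apple_silicon(platform.system(), platform.machine())


if __name__ == "__main__":
    if check_current_interpreter():
        print("Native Apple Silicon Python detected.")
    else:
        print("Not a native Apple Silicon Python; MPS may be unavailable.")
```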

Highlighted Details

  • Device Agnostic: Automatic device selection prioritizes MPS, then CUDA, then CPU.
  • Accelerated Modes: "Fast" (8-step generation) and "Ultra-Fast" (4-step generation) modes leverage Lightning LoRA; editing uses Rapid-AIO transformer for 4-step processing.
  • Versatile Functionality: Supports text-to-image generation, image editing, multi-image operations, custom LoRA integration, and negative prompts.
  • Unique Features: Includes a "Batman mode" for LEGO Batman photobombs and a "Photo-to-Anime" mode for style transformation.
  • Gradio UI: An experimental graphical interface (qwen-image-mps-gradio) is available.
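
The step counts listed above can be summarized as a small lookup. This is an illustrative sketch only: the names GENERATION_STEPS, EDIT_STEPS, and generation_steps are hypothetical, and the step count used when no accelerated mode is selected is not given in this summary, so the caller must supply it.

```python
from typing import Optional

# Step counts stated in the summary for the Lightning LoRA generation modes.
GENERATION_STEPS = {"fast": 8, "ultra-fast": 4}

# Editing with the Rapid-AIO transformer is described as 4-step processing.
EDIT_STEPS = 4


def generation_steps(mode: Optional[str], default_steps: int) -> int:
    """Return the number of inference steps for a generation mode.

    default_steps is a caller-supplied assumption used when no accelerated
    mode is requested.
    """
    if mode is None:
        return default_steps
    if mode not in GENERATION_STEPS:
        raise ValueError(f"unknown mode: {mode!r}")
    return GENERATION_STEPS[mode]
```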

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or project roadmap were found in the provided README content.

Licensing & Compatibility

The license type is not explicitly stated in the provided README content. Therefore, compatibility for commercial use or closed-source linking cannot be determined from this information.

Limitations & Caveats

MPS device usage requires a compatible PyTorch build and native Apple Silicon Python execution. The Gradio UI is explicitly marked as experimental. License details are absent, preventing assessment of usage restrictions.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 12 stars in the last 30 days
