imagesorcery-mcp by sunriseapps

Local image processing and recognition for AI assistants

Created 9 months ago

289 stars

Top 91.2% on SourcePulse

Project Summary

ImageSorcery MCP provides local image processing and recognition tools for AI assistants. It lets AI agents manipulate images, detect objects, and extract text directly on the user's machine, so images never need to be sent to a cloud service. This suits developers building AI-powered applications that need on-device image handling.

How It Works

ImageSorcery MCP runs as an MCP (Model Context Protocol) server that exposes its image processing tools through a standardized interface. It uses OpenCV for basic image operations, Ultralytics for object detection and segmentation, and EasyOCR for text extraction. Users describe a task in natural language to an AI assistant, which translates the request into the appropriate ImageSorcery MCP tool calls. All images and data are processed on the user's system, with no external data transmission.
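
To make the request flow concrete, here is a minimal sketch of the kind of message an MCP client sends when the assistant decides to invoke a tool. MCP is built on JSON-RPC 2.0, and tool invocations use the `tools/call` method with a tool name and an arguments object. The argument names below (`input_path`, `x1`, `y1`, `x2`, `y2`) and the file path are hypothetical placeholders, not taken from the project's documentation; the server's own tool listing is the authority on the real schema.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires each request to carry an id.
_ids = count(1)


def make_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names and path, for illustration only.
request = make_tool_call(
    "crop",
    {"input_path": "/tmp/photo.jpg", "x1": 0, "y1": 0, "x2": 200, "y2": 100},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends these messages; the sketch only shows what travels between the assistant's client and the local server.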

Quick Start & Requirements

Primary install: pipx install imagesorcery-mcp (recommended).
Prerequisites: Python 3.10 or higher, pipx, and system libraries ffmpeg, libsm6, libxext6, libgl1-mesa-glx (required by OpenCV). An MCP client (e.g., Claude.app, Cline) is necessary for interaction.
Setup: Run imagesorcery-mcp --post-install after installation; it downloads models and attempts to install the clip package. The README also gives detailed instructions for manual virtual environment setups and for potential issues with uv venv.
Links: Official website: imagesorcery.net.
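
The prerequisites above can be partly checked before installing. The following is a rough sketch, not part of the project: the helper name `missing_prerequisites` is invented here, and it only checks the Python version and whether `pipx` and `ffmpeg` are on the PATH. It does not verify the shared libraries (libsm6, libxext6, libgl1-mesa-glx), which are system packages rather than executables.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 10)              # minimum version stated in the prerequisites
REQUIRED_COMMANDS = ("pipx", "ffmpeg")  # executables that should be on PATH


def missing_prerequisites(version=None, which=shutil.which) -> list:
    """Return human-readable problems; an empty list means the checks passed.

    `version` and `which` are injectable so the function can be tested
    without depending on the current machine.
    """
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < REQUIRED_PYTHON:
        problems.append(
            f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for command in REQUIRED_COMMANDS:
        if which(command) is None:
            problems.append(f"'{command}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

If the script prints nothing, the checked prerequisites are in place and `pipx install imagesorcery-mcp` followed by `imagesorcery-mcp --post-install` is the next step.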

Highlighted Details

Comprehensive toolset includes: crop, resize, rotate, background removal, drawing (text, shapes, arrows), color manipulation, object detection, OCR, and image overlay.
Supports advanced features like object segmentation masks and text field detection.
Enables complex, multi-step image tasks through natural language prompts directed at an AI assistant.
All operations are performed locally, ensuring user privacy and data security.

Maintenance & Community

The project lists LinkedIn contacts for the author (titulus) and the CEO (Vlad Karm). Users are encouraged to open issues in the repository for bug reports or feature requests. The README does not mention dedicated community channels such as Discord or Slack.

Licensing & Compatibility

This project is licensed under the MIT License, permitting broad use, modification, and distribution, including for commercial purposes and integration into closed-source applications.

Limitations & Caveats

The installation of the clip Python package, required for text-based image searching, can be complex and may require manual intervention, particularly when using uv venv. Users must have an MCP client configured to communicate with the ImageSorcery MCP server. Certain system libraries may need to be installed separately depending on the operating system or container environment.
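
As an illustration of the client-configuration requirement, the sketch below generates a server entry in the `mcpServers` layout that several MCP clients (for example Claude Desktop and Cline) read from their JSON settings. This is an assumption about the client, not something the summary specifies: the exact keys, the settings file location, and any extra options depend on the client in use, and the entry name is arbitrary.

```python
import json


def build_client_config(command: str = "imagesorcery-mcp") -> dict:
    """Return a minimal MCP client entry that launches the server by command.

    When the package lives in a manually created virtual environment,
    `command` would be the full path to the executable inside that
    environment instead of the bare command name.
    """
    return {
        "mcpServers": {
            "imagesorcery-mcp": {
                "command": command,
                "args": [],
            }
        }
    }


config = build_client_config()
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the client's settings file; consult the client's documentation for where that file lives and which additional fields it accepts.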
