Pixelle-MCP by AIDC-AI

Omnimodal agent framework for ComfyUI and LLMs

Created 1 month ago
376 stars

Top 75.5% on SourcePulse

Project Summary

Pixelle-MCP is an open-source framework that bridges ComfyUI's visual node-based workflow system with Large Language Models (LLMs) via the MCP protocol. It enables users to transform ComfyUI workflows into callable "MCP Tools" with zero coding, allowing LLMs to dynamically execute complex multimodal AI generation tasks. The target audience includes AI researchers, developers, and power users seeking to integrate sophisticated generative AI capabilities into LLM-driven applications.

How It Works

The core innovation lies in the "Workflow-as-MCP Tool" solution. ComfyUI workflows are exported in an API format, with special syntax in node titles defining parameters and outputs. The MCP server then converts these exported workflows into MCP-compliant tools that LLMs can discover and invoke. This approach leverages the extensive ComfyUI ecosystem for multimodal generation (text, image, sound, video) and integrates with various LLMs through the LiteLLM framework, offering a flexible and extensible platform.
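To make the conversion concrete, here is a minimal Python sketch of how an exported API-format workflow could be mapped to an MCP-style tool schema. The `$name!` title marker and the resulting schema shape are hypothetical stand-ins for illustration; see the project docs for the actual title syntax.

```python
# Hypothetical convention: a node title starting with "$" exposes that node
# as a tool parameter, and a trailing "!" marks it required.
# (Illustrative only -- consult the Pixelle-MCP docs for the real syntax.)
def workflow_to_tool(name, workflow):
    """Map an exported ComfyUI API-format workflow to an MCP-style tool schema."""
    params, required = {}, []
    for node in workflow.values():
        title = node.get("_meta", {}).get("title", "")
        if not title.startswith("$"):
            continue
        pname = title.lstrip("$").rstrip("!")
        params[pname] = {"type": "string"}
        if title.endswith("!"):
            required.append(pname)
    return {
        "name": name,  # tool name is derived from the exported file name
        "inputSchema": {
            "type": "object",
            "properties": params,
            "required": required,
        },
    }

# A tiny mock of an API-format export: one marked node, one ordinary node.
workflow = {
    "3": {"class_type": "CLIPTextEncode", "_meta": {"title": "$prompt!"},
          "inputs": {"text": "a cat"}},
    "4": {"class_type": "KSampler", "_meta": {"title": "KSampler"},
          "inputs": {"seed": 0}},
}
tool = workflow_to_tool("t2i_workflow", workflow)
```

An LLM client could then discover `tool` through the MCP server and call it with `{"prompt": "..."}`, leaving execution of the underlying workflow to ComfyUI.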

Quick Start & Requirements

  • Installation: Clone the repository, copy config.yml.example to config.yml, and configure ComfyUI service address and LLM models.
  • Prerequisites: A running ComfyUI instance is required. Supports multiple LLMs (OpenAI, Ollama, Gemini, etc.) via LiteLLM.
  • Running: Docker Compose (docker compose up -d) or provided shell scripts (run.sh, run.bat).
  • Access: Client UI at http://localhost:9003, MCP Server at http://localhost:9002/sse.
  • Docs: https://pixelle.ai
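For orientation, a config.yml might look roughly like the sketch below. The key names are illustrative guesses, not the project's actual schema; copy config.yml.example and consult the docs for the real fields.

```yaml
# Hypothetical sketch of config.yml -- field names are illustrative.
comfyui:
  base_url: http://localhost:8188   # address of your running ComfyUI instance
llm:
  models:
    - provider: openai              # any LiteLLM-supported provider
      model: gpt-4o
      api_key: sk-...
```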

Highlighted Details

  • Full-modal support (Text, Image, Sound/Speech, Video) conversion and generation.
  • Zero-code development for converting ComfyUI workflows into MCP Tools.
  • MCP Server and Client architecture for flexible integration with various LLMs and clients.
  • Unified YAML configuration for managing services.
  • Supports a wide range of LLMs including OpenAI, Ollama, Gemini, Claude, and more.

Maintenance & Community

The project is actively developed and welcomes community contributions. Links to Discord and WeChat groups are provided for support and updates.

Licensing & Compatibility

Released under the MIT License, which permits commercial use, modification, and redistribution, including in closed-source products.

Limitations & Caveats

Optional parameters in ComfyUI workflows require default values to be set in the node. Fields already connected to other nodes are not parsed as parameters. Tool naming relies on the exported file name.
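The "connected fields" rule follows from ComfyUI's API export format, where a literal input holds a value directly while a linked input is a `[node_id, output_index]` pair. A minimal sketch of the distinction (node contents are made up for illustration):

```python
# Inputs of a single node from a ComfyUI API-format export (mock data):
node_inputs = {
    "text": "blurry, low quality",  # literal value -> usable as a parameter default
    "clip": ["4", 1],               # link to node 4's output 1 -> not parsed as a parameter
}

# Only non-link inputs are candidates for tool parameters.
parseable = {k: v for k, v in node_inputs.items() if not isinstance(v, list)}
```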

Health Check

  • Last Commit: 15 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 12
Star History
349 stars in the last 30 days

Starred by Tobi Lutke (Cofounder of Shopify), Shizhe Diao (Research Scientist at NVIDIA; Author of LMFlow), and 20 more.