SwarmUI by mcmonkeyprojects

Web UI for AI image/video generation, emphasizing power-tools and extensibility

created 1 year ago
2,905 stars

Top 16.8% on sourcepulse

Project Summary

SwarmUI is a modular, high-performance web UI for AI image and video generation, supporting Stable Diffusion, Flux, and various video models. It targets both beginners with an intuitive "Generate" tab and advanced users with a "Comfy Workflow" tab, aiming to be a comprehensive tool for AI media creation.

How It Works

SwarmUI is built with extensibility and performance in mind, leveraging a modular architecture. It supports a wide range of AI models, including Stable Diffusion and Flux for images, and LTX-V, Hunyuan Video, and Cosmos for video, with future plans for audio and more. The UI offers a user-friendly "Generate" tab for ease of use and a "Comfy Workflow" tab for direct access to underlying graph structures, facilitating advanced customization and experimentation.

Quick Start & Requirements

  • Windows: Download and run Install-Windows.bat. Requires Git and the .NET 8 SDK (both are installed automatically on Windows 11).
  • Linux: Download install-linux.sh. Requires Git, Python 3.10-3.12 (with pip and venv), and .NET 8 SDK.
  • macOS: Requires M-Series Apple silicon. Install via brew install dotnet python@3.11 virtualenv, then clone and run ./launch-macos.sh.
  • Docker: See Docs/Docker.md for instructions.
  • Colab/Cloud: Links provided for Google Colab and Runpod/Vast.ai.
  • Setup: Initial setup may take several minutes due to model downloads.
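
The Linux prerequisites listed above can be checked before running the installer. The following is a minimal sketch, not part of SwarmUI: the function names are hypothetical, and it only confirms that `git` and `dotnet` are on the PATH and that the running Python is within 3.10-3.12. It does not verify that the installed .NET SDK is version 8, nor that pip and venv are available.

```python
import shutil
import sys

# Supported range taken from the Linux requirement above (Python 3.10-3.12).
MIN_PY = (3, 10)
MAX_PY = (3, 12)


def python_version_ok(version=None):
    """Return True if (major, minor) is within 3.10-3.12 inclusive.

    `version` is any sequence starting with (major, minor); it defaults
    to the interpreter running this script.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor <= MAX_PY


def missing_tools(tools=("git", "dotnet")):
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def check_prerequisites():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = [f"'{tool}' was not found on PATH" for tool in missing_tools()]
    if not python_version_ok():
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {found} is outside the 3.10-3.12 range")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Basic prerequisites look fine.")
    sys.exit(1 if issues else 0)
```

Run it with the same Python interpreter the installer will use, since the version check inspects the interpreter executing the script.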

Highlighted Details

  • Supports a broad spectrum of AI image and video models.
  • Offers both a simplified "Generate" tab and an advanced "Comfy Workflow" tab.
  • Designed for extensibility with a modular architecture.
  • Includes power tools such as a Grid Generator and an image editor.

Maintenance & Community

  • Beta status, with active development and calls for PRs.
  • Discord community for support and announcements.
  • Feature Announcements Thread available.

Licensing & Compatibility

SwarmUI itself is licensed under the MIT License. However, it embeds or can auto-install components under other licenses, including LGPL (7-zip), GPL (ComfyUI), AGPL (AUTOMATIC1111, ultralytics), Apache 2.0 (improved-aesthetic-predictor, ImageSharp, Material Symbols), and MIT (various Python packages, insightface, web assets). Users should be aware of potential AGPL/GPL implications when using these components. Model licenses are separate from all of the above.

Limitations & Caveats

The project is in Beta, with noted areas for improvement including better mobile browser support, native LLM-assisted prompting integration, and more convenient self-contained installers (e.g., .exe, .msi). Some Linux installations may require manual .NET runtime setup.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 22
  • Issues (30d): 50

Star History

490 stars in the last 90 days
