Universal control framework for diffusion transformer models
OminiControl provides a minimal and universal framework for controlling Diffusion Transformer models, specifically FLUX.1. It enables subject-driven and spatial control (e.g., edge-guided generation, inpainting) with minimal parameter overhead, making it useful for researchers and developers who want to add controllable generation to DiT-based pipelines.
How It Works
OminiControl injects control signals into Diffusion Transformers with a minimal design, adding only ~0.1% extra parameters on top of the base model. Condition images are encoded into tokens that attend jointly with the denoised image tokens, so the original model architecture is preserved while enabling diverse control mechanisms, including subject replication and spatial conditioning such as edge-to-image and inpainting.
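As a rough illustration of this design (not the repository's actual code; all class and function names below are hypothetical), the sketch appends condition-image tokens to the denoised image tokens and routes both through a shared projection whose only trainable parameters are a small low-rank (LoRA) update:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: W x + B (A x)."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

def joint_forward(block_qkv: LoRALinear, image_tokens, condition_tokens):
    """Concatenate condition tokens with image tokens so attention can mix them."""
    tokens = torch.cat([image_tokens, condition_tokens], dim=1)  # (B, N+M, D)
    out = block_qkv(tokens)                   # shared weights, LoRA-adapted
    return out[:, : image_tokens.shape[1]]    # keep only the image-token outputs

# Toy shapes: the trainable LoRA parameters are a tiny fraction of the base layer.
qkv = LoRALinear(nn.Linear(64, 192), rank=4)
img, cond = torch.randn(1, 16, 64), torch.randn(1, 16, 64)
print(joint_forward(qkv, img, cond).shape)  # torch.Size([1, 16, 192])
```

Because the base weights stay frozen and only the rank-4 matrices train, the added parameter count remains tiny relative to the full model.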
Quick Start & Requirements
Clone the repository, then install its dependencies:

pip install -r requirements.txt
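The repository ships its own generation utilities for feeding condition images; the sketch below only outlines the surrounding flow using the diffusers API that OminiControl builds on. The Hugging Face repo ID and weight filename are assumptions, not verified values:

```python
import torch
from diffusers import FluxPipeline

# Load the FLUX.1 base model that OminiControl targets.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Attach an OminiControl condition LoRA (repo ID and weight name are assumptions).
pipe.load_lora_weights(
    "Yuanshi/OminiControl",
    weight_name="omini/subject_512.safetensors",
)

image = pipe(
    prompt="A photo of the subject on a beach",
    height=512,
    width=512,  # released subject-driven weights target 512x512
    num_inference_steps=8,
).images[0]
image.save("result.png")
```

Note that passing the condition image itself requires the repository's own generate wrapper; the plain diffusers pipeline above only illustrates model and LoRA loading.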
Highlighted Details
Built on FLUX.1 Diffusion Transformers; adds only ~0.1% parameters over the base model; supports both subject-driven and spatial (edge-guided, inpainting) control; subject-driven weights released at 512x512, with 1024x1024 models also available.
Maintenance & Community
The project has released OminiControl2, which adds more efficient conditioning methods, and supports custom style LoRAs. Training code and higher-resolution models are available, and links to demos and inference examples are provided.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Subject-driven generation is primarily optimized for objects rather than human subjects, due to training-data limitations. The initially released models support only 512x512 resolution for subject-driven generation, although 1024x1024 models have since been released. The subject-driven model may also not perform well with FLUX.1-dev.
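Given the 512x512 constraint of the subject-driven weights, condition images generally need to be resized before inference. A minimal preprocessing sketch (assuming Pillow; the helper name is hypothetical):

```python
from PIL import Image

def prepare_condition(path: str, size: int = 512) -> Image.Image:
    """Load a condition image and resize it to the square resolution the weights expect."""
    img = Image.open(path).convert("RGB")
    return img.resize((size, size), Image.LANCZOS)
```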