MIGC by limuloo

Research code for text-to-image synthesis with multi-instance generation control

Created 2 years ago

615 stars

Top 53.4% on SourcePulse


1 Expert Loves This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

MIGC and MIGC++ offer advanced control over text-to-image synthesis, enabling users to specify multiple object instances with precise locations and attributes. This is particularly beneficial for researchers and artists requiring fine-grained control over generated imagery, moving beyond simple text prompts to detailed scene composition.

How It Works

MIGC acts as a plug-and-play controller for diffusion models, leveraging spatial conditioning (masks and boxes) to guide instance generation. MIGC++ enhances this by supporting simultaneous box and mask control, and introduces an iterative editing mode ("Consistent-MIG") for modifying specific instances while maintaining overall image consistency. This approach improves instance success rates and reduces attribute leakage compared to baseline methods.
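To make the idea of spatial conditioning concrete, the sketch below rasterizes per-instance bounding boxes into binary masks, which is one common way to express "this instance belongs in this region." It is illustrative only: the function name boxes_to_masks, the instance dictionary layout, and the normalized (x0, y0, x1, y1) box convention are assumptions made for this example, not the MIGC API.

```python
import numpy as np


def boxes_to_masks(boxes, height, width):
    """Rasterize normalized (x0, y0, x1, y1) boxes into binary masks.

    Returns an array of shape (num_boxes, height, width) with 1 inside
    each box and 0 elsewhere. Hypothetical helper, not part of MIGC.
    """
    masks = np.zeros((len(boxes), height, width), dtype=np.uint8)
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        # Convert normalized coordinates to pixel indices.
        left, right = int(round(x0 * width)), int(round(x1 * width))
        top, bottom = int(round(y0 * height)), int(round(y1 * height))
        masks[i, top:bottom, left:right] = 1
    return masks


# Hypothetical scene description: one text prompt and one box per instance.
instances = [
    {"prompt": "a red apple", "box": (0.0, 0.0, 0.5, 0.5)},
    {"prompt": "a blue vase", "box": (0.5, 0.25, 1.0, 1.0)},
]

masks = boxes_to_masks([inst["box"] for inst in instances], height=64, width=64)
print(masks.shape)  # one mask per instance
```

A controller of this kind would pair each mask with its instance prompt; how MIGC consumes such regions internally is not described in this summary.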

Quick Start & Requirements

Installation: Requires Python 3.9 in a Conda environment. Run pip install -r requirement.txt followed by pip install -e . from the repository root.
Checkpoints: Download MIGC_SD14.ckpt (219M) or MIGC++_SD14.ckpt (191M) and place in pretrained_weights/. Note: MIGC_SD14.ckpt can be used with SD1.5.
Dependencies: Stable Diffusion, diffusers, CLIP, GLIGEN-GUI.
Demo: Colab demo available.
GUI: A MIGC-GUI is provided, requiring additional model downloads (CLIPTextModel, CetusMix).
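Before running a demo, it can help to confirm that the downloaded checkpoints are where the project expects them. The following is a minimal sketch using only the file and directory names mentioned above; the helper find_checkpoints is hypothetical and not part of the repository.

```python
from pathlib import Path

# Checkpoint file names as listed in the requirements above.
EXPECTED_CHECKPOINTS = ("MIGC_SD14.ckpt", "MIGC++_SD14.ckpt")


def find_checkpoints(weights_dir="pretrained_weights"):
    """Report which expected checkpoint files exist in weights_dir.

    Hypothetical convenience helper; returns a dict mapping each file
    name to True (present) or False (missing).
    """
    root = Path(weights_dir)
    return {name: (root / name).is_file() for name in EXPECTED_CHECKPOINTS}


for name, present in find_checkpoints().items():
    print(f"{name}: {'found' if present else 'missing'}")
```

Only one of the two checkpoints is needed, depending on whether MIGC or MIGC++ is being used.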

Highlighted Details

Achieves state-of-the-art performance on the COCO-MIG benchmark, outperforming methods like InstanceDiffusion in MIOU and Instance Success Rate.
Supports integration with LoRA for enhanced attribute and position control.
Offers an iterative editing mode (Consistent-MIG) for targeted image modifications.
Provides a GUI for more convenient art creation.
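For readers unfamiliar with the metrics named above, the sketch below shows how a mean IoU and a per-instance success rate could be computed from target and detected boxes. It is an assumption-laden illustration: the 0.5 IoU threshold, the box format, and the function names are chosen for this example, and the COCO-MIG benchmark's actual protocol (which may also check attributes) is not specified in this summary.

```python
def box_iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def summarize(targets, detections, threshold=0.5):
    """Return (mean IoU, fraction of instances with IoU >= threshold).

    Assumes targets[i] and detections[i] describe the same instance;
    the threshold value is an assumption made for this sketch.
    """
    ious = [box_iou(t, d) for t, d in zip(targets, detections)]
    mean_iou = sum(ious) / len(ious)
    success_rate = sum(iou >= threshold for iou in ious) / len(ious)
    return mean_iou, success_rate


# Hypothetical requested boxes and boxes detected in a generated image.
targets = [(0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 1.0, 1.0)]
detections = [(1.0, 0.0, 3.0, 2.0), (0.0, 0.0, 1.0, 1.0)]
mean_iou, success_rate = summarize(targets, detections)
print(mean_iou, success_rate)
```

In this toy case the first instance overlaps its target only partially and the second matches exactly, so the two metrics diverge, which is why both are typically reported.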

Maintenance & Community

The project is supervised by ReLER Lab at Zhejiang University and HUAWEI. Key contributors include Dewei Zhou and Yi Yang. Updates include the release of MIGC++ weights and the iterative editing mode. Contact email: zdw1999@zju.edu.cn.

Licensing & Compatibility

The repository is released under a permissive license, allowing for commercial use and integration with closed-source projects.

Limitations & Caveats

Training code is not available due to company requirements. Pretrained weights for SD1.5, SD2, and SDXL are listed as "coming soon." The MIGC-GUI is still under optimization.

Health Check

Last Commit: 9 months ago

Responsiveness: 1 day

Star History

0 stars in the last 30 days