Text-to-image synthesis research project using multi-instance generation control
Top 54.8% on sourcepulse
MIGC and MIGC++ offer advanced control over text-to-image synthesis, enabling users to specify multiple object instances with precise locations and attributes. This is particularly beneficial for researchers and artists requiring fine-grained control over generated imagery, moving beyond simple text prompts to detailed scene composition.
How It Works
MIGC acts as a plug-and-play controller for diffusion models, leveraging spatial conditioning (masks and boxes) to guide instance generation. MIGC++ enhances this by supporting simultaneous box and mask control, and introduces an iterative editing mode ("Consistent-MIG") for modifying specific instances while maintaining overall image consistency. This approach improves instance success rates and reduces attribute leakage compared to baseline methods.
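For illustration, here is a minimal sketch of the kind of layout such spatial conditioning consumes: one short description plus a normalized bounding box per instance, with an optional per-instance mask for the combined box-and-mask control in MIGC++. The field names and the box-to-mask rasterization are assumptions chosen for clarity, not the repository's exact input schema.

# Illustrative only: field names and the box-to-mask rasterization below are
# assumptions for clarity, not the repository's exact input schema.
import numpy as np

instances = [
    {"text": "a red apple", "box": (0.05, 0.55, 0.35, 0.95)},  # (x0, y0, x1, y1), normalized
    {"text": "a blue mug",  "box": (0.60, 0.50, 0.95, 0.95)},
]

# MIGC++ additionally accepts a binary mask per instance; here each box is
# simply rasterized onto a 512x512 grid as a stand-in for a hand-drawn mask.
H, W = 512, 512
for inst in instances:
    x0, y0, x1, y1 = inst["box"]
    mask = np.zeros((H, W), dtype=np.uint8)
    mask[int(y0 * H):int(y1 * H), int(x0 * W):int(x1 * W)] = 1
    inst["mask"] = mask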
Quick Start & Requirements
Install dependencies with pip install -r requirement.txt and pip install -e . Download MIGC_SD14.ckpt (219M) or MIGC++_SD14.ckpt (191M) and place it in pretrained_weights/. Note: MIGC_SD14.ckpt can be used with SD1.5.
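A hedged end-to-end sketch of what a first run might look like once a checkpoint is in pretrained_weights/. The MIGC-specific names (StableDiffusionMIGCPipeline, AttentionStore, load_migc, the MIGCsteps argument) are assumed from the repository's examples and may differ in your checkout; diffusers and a CUDA device are assumed available.

# Hedged sketch: the migc.* imports, the load_migc signature, and the MIGCsteps
# argument are assumptions based on the repository's examples; verify them
# against your checkout before relying on this.
import torch
from diffusers import EulerDiscreteScheduler
from migc.migc_pipeline import StableDiffusionMIGCPipeline, AttentionStore  # assumed import path
from migc.migc_utils import load_migc                                       # assumed import path

pipe = StableDiffusionMIGCPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.attention_store = AttentionStore()
load_migc(pipe.unet, pipe.attention_store, "pretrained_weights/MIGC_SD14.ckpt", attention_type="base")
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# One global prompt followed by a phrase per instance, plus matching normalized boxes.
prompts = [["best quality, a black ball and a yellow dog on the grass",
            "black ball", "yellow dog"]]
bboxes = [[[0.10, 0.55, 0.40, 0.90],
           [0.55, 0.40, 0.95, 0.90]]]

image = pipe(prompts, bboxes, num_inference_steps=50, guidance_scale=7.5,
             MIGCsteps=25).images[0]
image.save("migc_output.png")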
Highlighted Details
Maintenance & Community
The project is supervised by ReLER Lab at Zhejiang University and HUAWEI. Key contributors include Dewei Zhou and Yi Yang. Updates include the release of MIGC++ weights and the iterative editing mode. Contact email: zdw1999@zju.edu.cn.
Licensing & Compatibility
The repository is released under a permissive license, allowing for commercial use and integration with closed-source projects.
Limitations & Caveats
Training code is not available due to company requirements. Pretrained weights for SD1.5, SD2, and SDXL are listed as "coming soon." The MIGC-GUI is still under optimization.