MIGC by limuloo

Text-to-image synthesis research paper using multi-instance generation control

Created 1 year ago
606 stars

Top 54.1% on SourcePulse

Project Summary

MIGC and MIGC++ offer advanced control over text-to-image synthesis, enabling users to specify multiple object instances with precise locations and attributes. This is particularly beneficial for researchers and artists requiring fine-grained control over generated imagery, moving beyond simple text prompts to detailed scene composition.

How It Works

MIGC acts as a plug-and-play controller for diffusion models, leveraging spatial conditioning (masks and boxes) to guide instance generation. MIGC++ enhances this by supporting simultaneous box and mask control, and introduces an iterative editing mode ("Consistent-MIG") for modifying specific instances while maintaining overall image consistency. This approach improves instance success rates and reduces attribute leakage compared to baseline methods.
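
To make the idea of spatial conditioning concrete, the sketch below shows one way a per-instance description with a bounding box could be turned into a binary mask, and how overlapping instances (a common source of attribute leakage) could be detected. This is an illustration only: the Instance class, box_to_mask, and overlap_cells are hypothetical names, the normalized (x0, y0, x1, y1) box format is an assumption, and none of this is taken from the MIGC code base.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    """Hypothetical container: one requested object and its normalized box."""
    prompt: str
    box: Tuple[float, float, float, float]  # (x0, y0, x1, y1), each in [0, 1]


def box_to_mask(box, height: int, width: int) -> List[List[int]]:
    """Rasterize a normalized box into a height x width grid of 0/1 values."""
    x0, y0, x1, y1 = box
    # Convert normalized coordinates to half-open pixel index ranges.
    c0, c1 = round(x0 * width), round(x1 * width)
    r0, r1 = round(y0 * height), round(y1 * height)
    return [
        [1 if (r0 <= r < r1 and c0 <= c < c1) else 0 for c in range(width)]
        for r in range(height)
    ]


def overlap_cells(instances: List[Instance], height: int, width: int) -> int:
    """Count grid cells claimed by more than one instance."""
    masks = [box_to_mask(inst.box, height, width) for inst in instances]
    return sum(
        1
        for r in range(height)
        for c in range(width)
        if sum(m[r][c] for m in masks) > 1
    )


instances = [
    Instance("a red apple", (0.0, 0.0, 0.5, 0.5)),
    Instance("a blue vase", (0.25, 0.25, 0.75, 0.75)),
]
print(sum(map(sum, box_to_mask(instances[0].box, 8, 8))))  # cells in the first mask
print(overlap_cells(instances, 8, 8))  # cells shared by both instances
```

A real controller would consume such masks at the resolution of the diffusion model's feature maps; the 8 x 8 grid here is only for readability.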

Quick Start & Requirements

  • Installation: Requires Python 3.9 and a Conda environment. Install the dependencies with pip install -r requirement.txt, then install the package in editable mode by running pip install -e . from the repository root.
  • Checkpoints: Download MIGC_SD14.ckpt (219M) or MIGC++_SD14.ckpt (191M) and place it in pretrained_weights/. Note: MIGC_SD14.ckpt can be used with SD1.5.
  • Dependencies: Stable Diffusion, diffusers, CLIP, GLIGEN-GUI.
  • Demo: Colab demo available.
  • GUI: A MIGC-GUI is provided, requiring additional model downloads (CLIPTextModel, CetusMix).
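
Before running any demo, it can help to confirm that the checkpoints sit where the steps above expect them. The following is a minimal sketch, not part of the MIGC repository: the file names and the pretrained_weights/ directory come from this summary, while the helper names find_checkpoints and missing_checkpoints are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# File names as listed in this summary; only one of the two is required.
CHECKPOINTS = ("MIGC_SD14.ckpt", "MIGC++_SD14.ckpt")


def find_checkpoints(repo_root: str) -> Dict[str, bool]:
    """Map each known checkpoint name to whether it exists on disk."""
    weights_dir = Path(repo_root) / "pretrained_weights"
    return {name: (weights_dir / name).is_file() for name in CHECKPOINTS}


def missing_checkpoints(repo_root: str) -> List[str]:
    """Return the checkpoint names that were not found."""
    return [name for name, ok in find_checkpoints(repo_root).items() if not ok]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name, present in find_checkpoints(".").items():
        print(f"{name}: {'found' if present else 'missing'}")
```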

Highlighted Details

  • Achieves state-of-the-art performance on the COCO-MIG benchmark, outperforming methods like InstanceDiffusion in MIOU and Instance Success Rate.
  • Supports integration with LoRA for enhanced attribute and position control.
  • Offers an iterative editing mode (Consistent-MIG) for targeted image modifications.
  • Provides a GUI for more convenient art creation.
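
The MIOU and Instance Success Rate figures mentioned above are built on box overlap. The sketch below shows the standard intersection-over-union formula for axis-aligned boxes, a mean over paired boxes, and a success rate under an assumed IoU threshold of 0.5. The pairing of targets with detections, the threshold value, and the function names are assumptions for illustration; the benchmark's actual protocol (detector, matching, attribute checks) is not reproduced here.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(targets: Sequence[Box], detections: Sequence[Box]) -> float:
    """Average IoU over already-paired target and detected boxes."""
    scores = [box_iou(t, d) for t, d in zip(targets, detections)]
    return sum(scores) / len(scores) if scores else 0.0


def success_rate(targets: Sequence[Box], detections: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """Fraction of pairs whose IoU reaches the (assumed) threshold."""
    scores = [box_iou(t, d) for t, d in zip(targets, detections)]
    hits = sum(1 for s in scores if s >= threshold)
    return hits / len(scores) if scores else 0.0


targets = [(0, 0, 2, 2), (0, 0, 2, 2)]
detections = [(0, 0, 2, 2), (1, 1, 3, 3)]
print(mean_iou(targets, detections))
print(success_rate(targets, detections))
```

In this example the first pair overlaps perfectly and the second shares one unit of area out of a union of seven, so only the first pair clears the assumed threshold.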

Maintenance & Community

The project is supervised by ReLER Lab at Zhejiang University and HUAWEI. Key contributors include Dewei Zhou and Yi Yang. Updates include the release of MIGC++ weights and the iterative editing mode. Contact email: zdw1999@zju.edu.cn.

Licensing & Compatibility

The repository is released under a permissive license, allowing for commercial use and integration with closed-source projects.

Limitations & Caveats

Training code is not available due to company requirements. Pretrained weights for SD1.5, SD2, and SDXL are listed as "coming soon." The MIGC-GUI is still under optimization.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 6 stars in the last 30 days
