InstantID by instantX-research

Research project for identity-preserving image generation from a single reference image

created 1 year ago
11,726 stars

Top 4.3% on sourcepulse

Project Summary

InstantID is a zero-shot method for identity-preserving image generation that needs only a single reference image. It targets researchers and artists who want to generate varied images of the same identity, and it can be applied to a range of downstream tasks.

How It Works

InstantID is tuning-free: it adds identity conditioning to Stable Diffusion without fine-tuning the model for each identity. It uses IdentityNet, a pre-trained ControlNet, to capture identity information from a single input image, and an IP-Adapter to inject that identity into the generation process. Together these give high-fidelity identity preservation while balancing identity retention against prompt controllability.
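The injection step can be illustrated with a toy example. IP-Adapter-style modules are generally described as using decoupled cross-attention, where the image (here, identity) tokens get their own attention branch whose output is added to the text branch. The sketch below is a minimal pure-Python illustration of that idea, not the project's implementation: the function names, the `id_scale` parameter, and the tiny hand-written vectors are all assumptions made for this example, and real models compute keys and values through learned projections.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def decoupled_cross_attention(queries, text_kv, id_kv, id_scale=1.0):
    """Text attention plus a separately computed, scaled identity attention.

    text_kv and id_kv are (keys, values) pairs. id_scale is a hypothetical
    knob for how strongly the identity branch contributes.
    """
    text_out = attention(queries, *text_kv)
    id_out = attention(queries, *id_kv)
    return [
        [t + id_scale * i for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(text_out, id_out)
    ]
```

Setting `id_scale` to 0 reduces the result to plain text cross-attention, which is one way to see how a separate identity branch can trade identity retention against prompt controllability.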

Quick Start & Requirements

  • Install: pip install opencv-python transformers accelerate insightface diffusers
  • Prerequisites: Python 3.x, PyTorch, CUDA (for GPU acceleration), antelopev2 face encoder model (manual download required).
  • Setup: Download the required model checkpoints before first use; setup time depends mainly on download speed.
  • Demos: Huggingface Spaces, ModelScope, OpenXLab.
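
After running the install command, a quick way to confirm the environment is to check that each package can be found by Python. The following is a minimal sketch using only the standard library; the mapping of pip names to import names is an assumption based on common usage (for example, `opencv-python` is imported as `cv2`), and `torch` is included because PyTorch is listed as a prerequisite. It does not check CUDA or the manually downloaded antelopev2 model.

```python
import importlib.util

# Assumed import names for the packages in the install line, plus PyTorch.
# Note that opencv-python is imported as cv2.
REQUIRED_MODULES = {
    "opencv-python": "cv2",
    "transformers": "transformers",
    "accelerate": "accelerate",
    "insightface": "insightface",
    "diffusers": "diffusers",
    "torch": "torch",
}


def check_modules(modules=REQUIRED_MODULES):
    """Return {pip_name: bool} telling whether each module can be located."""
    return {
        pip_name: importlib.util.find_spec(import_name) is not None
        for pip_name, import_name in modules.items()
    }


if __name__ == "__main__":
    for pip_name, found in check_modules().items():
        print(f"{pip_name:15s} {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed with pip before downloading the model checkpoints.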

Highlighted Details

  • Achieves better fidelity and text editability compared to previous tuning-free methods.
  • Competitive results against character LoRAs without requiring multiple images or training.
  • Flexible integration of face and background, especially in non-realistic styles.
  • Compatible with LCM-LoRA for accelerated inference.
  • Pipeline merged into diffusers library.

Maintenance & Community

  • Developed by InstantX Team, with contributions from Xiaohongshu Inc. and Peking University.
  • Past updates added LCM acceleration and Multi-ControlNet support; the last commit was about a year ago.
  • Community resources include Replicate, WebUI, and ComfyUI integrations.
  • Contact: haofanwang.ai@gmail.com or wangqixun.ai@gmail.com.

Licensing & Compatibility

  • Code released under Apache License 2.0, permitting commercial use.
  • Face models (antelopev2) and released checkpoints are for non-commercial research purposes only, as per insightface's license.

Limitations & Caveats

The project currently does not support multi-person generation, processing only the largest detected face. The licensing for the face encoder and checkpoints restricts their use to research purposes, which may impact commercial applications.
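
The "largest detected face" rule can be sketched as follows. This is an illustration under stated assumptions, not the project's code: it assumes each detection is a dict with a `"bbox"` entry of the form `(x1, y1, x2, y2)` in pixels, and the helper names `bbox_area` and `largest_face` are hypothetical.

```python
def bbox_area(face):
    """Area of a detection whose "bbox" is (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = face["bbox"]
    # Clamp at zero so a malformed box never yields a negative area.
    return max(0, x2 - x1) * max(0, y2 - y1)


def largest_face(faces):
    """Return the detection with the largest bounding box, or None if empty."""
    if not faces:
        return None
    return max(faces, key=bbox_area)
```

In a group photo, every face other than the one returned here would be ignored, which is why multi-person generation is listed as unsupported.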

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 3
  • Star history: 181 stars in the last 90 days
