Photo recrafting research paper with identity preservation using Diffusion Transformers
InfiniteYou (InfU) is a framework for high-fidelity, identity-preserving image generation built on Diffusion Transformers (DiTs). It targets users who need to recraft photos while keeping the subject's identity intact. It addresses limitations of existing methods, such as poor text-image alignment and low generation quality, by introducing InfuseNet and a multi-stage training strategy.
How It Works
InfU uses InfuseNet to inject identity features into a DiT base model (FLUX.1-dev) via residual connections. This approach improves identity similarity without compromising the base model's generation capabilities. A multi-stage training process, consisting of pretraining followed by supervised fine-tuning (SFT) on synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment and image quality and mitigates face copy-pasting.
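The residual injection described above can be illustrated with a toy sketch. This is not InfuseNet's actual code: the function names, the plain-list "hidden states", and the choice to add one residual after every block are all assumptions made for illustration. It only shows the general idea that identity features are added to the base model's activations rather than replacing them, so setting the scale to zero recovers the base model's output.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def inject_residual(hidden: Vector, residual: Vector, scale: float = 1.0) -> Vector:
    """Add a scaled identity residual to a hidden state, element by element."""
    if len(hidden) != len(residual):
        raise ValueError("hidden state and residual must have the same length")
    return [h + scale * r for h, r in zip(hidden, residual)]


def forward_with_identity(
    hidden: Vector,
    blocks: Sequence[Block],
    id_residuals: Sequence[Vector],
    scale: float = 1.0,
) -> Vector:
    """Run the base blocks in order, adding one identity residual after each.

    Hypothetical layout: the source does not say which blocks receive residuals.
    """
    if len(blocks) != len(id_residuals):
        raise ValueError("need exactly one residual per block")
    for block, residual in zip(blocks, id_residuals):
        hidden = block(hidden)                             # base model computation
        hidden = inject_residual(hidden, residual, scale)  # identity injection
    return hidden


def double(v: Vector) -> Vector:
    """Stand-in for a transformer block (hypothetical)."""
    return [2.0 * x for x in v]


blocks = [double, double]
id_residuals = [[1.0, 1.0], [0.5, -0.5]]  # stand-ins for identity-branch outputs

# scale=0.0 disables injection, so the base model's behaviour is unchanged.
base = forward_with_identity([1.0, 2.0], blocks, id_residuals, scale=0.0)
infused = forward_with_identity([1.0, 2.0], blocks, id_residuals, scale=1.0)
print(base, infused)
```

Because the identity signal enters only as an additive term, the base model's weights and forward path stay untouched, which is one plausible reading of why generation capabilities are preserved.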
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Run inference: python test.py --id_image <path> --prompt <text>
ComfyUI integration: bytedance/ComfyUI_InfiniteYou
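For scripted use, the inference command can be assembled in Python. The sketch below is an illustration only: the helper name `build_infu_command` is hypothetical, and it relies solely on the two flags shown above (`--id_image` and `--prompt`); any other options the script may accept are not assumed here.

```python
from typing import List


def build_infu_command(id_image: str, prompt: str) -> List[str]:
    """Build the argument list for the documented test.py invocation.

    Hypothetical helper: only the --id_image and --prompt flags from the
    quick-start command are used.
    """
    if not id_image:
        raise ValueError("id_image path must not be empty")
    if not prompt:
        raise ValueError("prompt must not be empty")
    # Passing a list (not a single string) avoids shell quoting issues
    # when the prompt contains spaces or punctuation.
    return ["python", "test.py", "--id_image", id_image, "--prompt", prompt]


cmd = build_infu_command("face.jpg", "a portrait, studio lighting")
print(cmd)

# To actually run it from the repository root (requires the installed
# dependencies and model weights):
#   import subprocess
#   subprocess.run(cmd, check=True)
```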
Highlighted Details
Two variants: aes_stage2 (better aesthetics/alignment) and sim_stage1 (higher ID similarity).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The model is licensed for non-commercial, academic research purposes only. Users must ensure compliance with local laws and the licenses of all dependencies.