Photo customization research paper using stacked ID embedding
PhotoMaker enables rapid, LoRA-free customization of realistic human photos by stacking identity embeddings. It targets researchers and artists seeking to generate diverse, high-fidelity images with precise text control, integrating seamlessly with existing Stable Diffusion XL pipelines and LoRA modules.
How It Works
PhotoMaker employs a stacked ID embedding approach, integrating identity information directly into the diffusion process without requiring additional LoRA training. This method allows for efficient personalization, preserving key facial features and enabling stylistic control through text prompts and compatibility with other LoRA modules.
Quick Start & Requirements
Install the dependencies with pip install -r requirements.txt, followed by pip install git+https://github.com/TencentARC/PhotoMaker.git.
bfloat16 support is recommended for optimal speed; use torch.float16 if it is unsupported.
Minimum GPU memory: 11GB.
Highlighted Details
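The precision guidance under Quick Start & Requirements (prefer bfloat16, fall back to torch.float16) can be sketched as a small helper. This is a minimal illustration, not code from the PhotoMaker repository: the function names choose_dtype_name and detect_torch_dtype are hypothetical, and treating a machine without CUDA as "bfloat16 unsupported" is an assumption. The only PyTorch calls used are torch.cuda.is_available() and torch.cuda.is_bf16_supported().

```python
def choose_dtype_name(cuda_available: bool, bf16_supported: bool) -> str:
    """Return the name of the torch dtype to use (hypothetical helper).

    Mirrors the recommendation: bfloat16 when the GPU supports it,
    otherwise float16.
    """
    if cuda_available and bf16_supported:
        return "bfloat16"
    # Assumption: without CUDA we report the float16 fallback as well.
    return "float16"


def detect_torch_dtype():
    """Query the local GPU and return the matching torch dtype object.

    torch is imported lazily so choose_dtype_name stays usable
    on machines where PyTorch is not installed.
    """
    import torch

    cuda = torch.cuda.is_available()
    # Only ask about bfloat16 when a CUDA device is actually present.
    bf16 = cuda and torch.cuda.is_bf16_supported()
    return getattr(torch, choose_dtype_name(cuda, bf16))
```

The value returned by detect_torch_dtype() would typically be passed as the torch_dtype argument when loading the pipeline; check the project's own instructions for the exact loading call.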
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats