ComfyUI-HyperLoRA by bytedance

ComfyUI tool for parameter-efficient portrait synthesis (CVPR 2025 paper)

Created 10 months ago

483 stars

Top 63.6% on SourcePulse

Project Summary

This repository provides the official ComfyUI implementation of HyperLoRA, a parameter-efficient adaptive generation method for personalized portrait synthesis. Existing approaches each have a drawback: LoRA requires resource-intensive training for every person, and IP-Adapter can produce less natural results. HyperLoRA instead generates LoRA weights adaptively, which enables zero-shot personalized generation with high fidelity and editability. It is aimed at researchers and users who need realistic, controllable portrait generation.

How It Works

HyperLoRA is split into two modules, Hyper ID-LoRA and Hyper Base-LoRA. The ID-LoRA learns identity information, while the Base-LoRA absorbs other attributes such as background and clothing so that they do not leak into the identity features. During training, only the HyperLoRA modules are updated; the base SDXL model and the encoders stay frozen. At inference, the ID-LoRA is merged into SDXL for personalization, and the Base-LoRA is optional. The approach combines the performance of LoRA with the zero-shot capability of adapter-based methods.
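To make the inference step concrete, here is a minimal sketch of how low-rank LoRA updates are typically merged into a frozen weight matrix, using the standard LoRA formula W' = W + scale * (up @ down). It is an illustration in plain Python, not code from the HyperLoRA repository: the names `merge_lora`, `id_lora`, and `base_lora` and all numeric values are hypothetical, and the network that actually generates the LoRA matrices is not shown.

```python
# Illustrative only: a plain-Python LoRA merge, not the repository's implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, loras):
    """Return weight + sum(scale * up @ down) over the given LoRAs.

    weight: d_out x d_in matrix (the frozen base weight, left unmodified)
    loras:  iterable of (up, down, scale); up is d_out x r, down is r x d_in
    """
    merged = [row[:] for row in weight]  # copy so the base weight stays frozen
    for up, down, scale in loras:
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged

# Toy 2x2 base weight and rank-1 LoRAs standing in for generated weights.
base_weight = [[1.0, 0.0], [0.0, 1.0]]
id_lora = ([[1.0], [2.0]], [[0.5, 0.5]], 1.0)    # hypothetical ID-LoRA
base_lora = ([[1.0], [0.0]], [[0.0, 2.0]], 0.5)  # hypothetical, optional Base-LoRA

id_only = merge_lora(base_weight, [id_lora])
id_and_base = merge_lora(base_weight, [id_lora, base_lora])
```

Passing only the ID-LoRA mirrors the case where the Base-LoRA is left out; adding the second entry applies both updates to the same frozen weight.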

Quick Start & Requirements

Installation: Requires manual placement of downloaded model files into specific ComfyUI subdirectories (models/hyper_lora/, models/insightface/).
Prerequisites: ComfyUI, ComfyUI_ADV_CLIP_emb, ComfyUI-Impact-Pack plugins. Requires downloading specific CLIP processor, CLIP ViT, InsightFace (antelopev2), and HyperLoRA model files.
Usage: Text prompts must start with trigger words (fcsks, fxhks, fhyks). Recommended stop_at_clip_layer is -2.
Compatibility: Tested with LEOSAM's HelloWorld XL 3.0, CyberRealistic XL v1.1, and RealVisXL v4.0. Incompatible with ArienMixXL v4.0.
Resources: FP16 precision and distilled Base LoRA reduce model size and GPU memory usage.
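The setup and usage notes above can be checked with a small helper. The sketch below is hypothetical and not part of the project: it assumes the three trigger words are written space-separated in the listed order and joined to the rest of the prompt with a comma, and it only checks that the two model subdirectories named above exist and are non-empty, since the exact file names are not given here.

```python
from pathlib import Path

# Assumption: trigger words are space-separated and appear in this order.
TRIGGER_WORDS = "fcsks fxhks fhyks"
# Subdirectories named in the installation notes, relative to the ComfyUI root.
MODEL_SUBDIRS = ("models/hyper_lora", "models/insightface")

def with_trigger_words(prompt):
    """Prefix the prompt with the trigger words unless it already starts with them."""
    prompt = prompt.strip()
    if prompt.startswith(TRIGGER_WORDS):
        return prompt
    # Assumption: a comma separates the trigger words from the description.
    return f"{TRIGGER_WORDS}, {prompt}" if prompt else TRIGGER_WORDS

def missing_model_dirs(comfyui_root):
    """List expected model subdirectories that are absent or empty."""
    root = Path(comfyui_root)
    missing = []
    for subdir in MODEL_SUBDIRS:
        path = root / subdir
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(subdir)
    return missing
```

For example, `with_trigger_words("a woman in a red coat")` yields a prompt that begins with the trigger words, and `missing_model_dirs("/path/to/ComfyUI")` reports which directories still need model files.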

Highlighted Details

Achieves zero-shot personalized portrait generation with high photorealism, fidelity, and editability.
Supports both single and multiple image inputs for personalization.
Offers two versions: _fidelity for better detail and _edit for enhanced editability.
Generated weights can serve as an initialization for further LoRA training; roughly 50 additional training steps can yield a better ID LoRA.

Maintenance & Community

Official implementation from ByteDance.
Paper accepted to CVPR 2025.
Links to project page and arXiv paper are provided.

Licensing & Compatibility

Code is licensed under GPL 3.0, requiring derivative works to also be GPL 3.0.
Models are licensed under CC BY-NC 4.0, allowing non-commercial sharing and adaptation with credit.
Third-party model licenses (SDXL, CLIP, InsightFace) must also be adhered to.
GPL 3.0 license may restrict integration into closed-source commercial products.

Limitations & Caveats

The CC BY-NC 4.0 model license restricts commercial use. Because the model was trained at a limited face resolution, the project recommends using FaceDetailer or ControlNet to repair small faces or improve stability.

Health Check

Last Commit

8 months ago

Responsiveness

1 day

Star History

8 stars in the last 30 days