ComfyUI-HyperLoRA by bytedance

ComfyUI tool for parameter-efficient portrait synthesis (CVPR 2025 paper)

created 3 months ago
376 stars

Top 76.7% on sourcepulse

Project Summary

This repository provides the official ComfyUI implementation of HyperLoRA, a parameter-efficient adaptive generation method for personalized portrait synthesis. It addresses limitations of existing methods like LoRA (resource-intensive per-person training) and IP-Adapter (potential lack of naturalness) by generating LoRA weights adaptively, achieving zero-shot personalized generation with high fidelity and editability. The target audience includes researchers and users focused on realistic and controllable portrait generation.

How It Works

HyperLoRA is split into two parts: Hyper ID-LoRA and Hyper Base-LoRA. The ID-LoRA learns identity information, while the Base-LoRA absorbs other attributes such as background and clothing so that they do not leak into the identity features. During training, only the HyperLoRA modules are updated; the base SDXL model and the encoders stay frozen. At inference, the ID-LoRA is integrated into SDXL for personalization, and the Base-LoRA is optional. The approach combines the performance of LoRA with the zero-shot capability of adapter-based methods.
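As a rough illustration of what integrating a LoRA into a frozen weight means, the sketch below applies the generic LoRA update W' = W + scale * (up @ down) to a tiny matrix, first for an ID-LoRA and then for an optional Base-LoRA. It is plain LoRA arithmetic under assumed shapes, not the HyperLoRA implementation: the function names, the `scale` parameter, and the example values are all hypothetical, and it does not show how HyperLoRA produces the low-rank factors in the first place.

```python
# Generic LoRA merge in plain Python (illustrative only; not the HyperLoRA code).
# All names and values here (matmul, merge_lora, scale, matrices) are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) without mutating the input."""
    delta = matmul(lora_up, lora_down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Frozen base weight (2x2) and two rank-1 factor pairs.
base = [[1.0, 0.0], [0.0, 1.0]]
id_up, id_down = [[1.0], [2.0]], [[0.5, 0.5]]        # stands in for the ID-LoRA
base_up, base_down = [[1.0], [0.0]], [[0.0, 1.0]]    # stands in for the optional Base-LoRA

# The ID-LoRA is always merged; the Base-LoRA is an optional second update.
merged = merge_lora(base, id_down, id_up, scale=1.0)
merged_with_base = merge_lora(merged, base_down, base_up, scale=0.5)
```

Because each update is additive and the base weight is never modified in place, the Base-LoRA step can be skipped without affecting the ID-LoRA result.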

Quick Start & Requirements

  • Installation: Requires manual placement of downloaded model files into specific ComfyUI subdirectories (models/hyper_lora/, models/insightface/).
  • Prerequisites: ComfyUI with the ComfyUI_ADV_CLIP_emb and ComfyUI-Impact-Pack plugins. The CLIP processor, CLIP ViT, InsightFace (antelopev2), and HyperLoRA model files must be downloaded separately.
  • Usage: Text prompts must start with trigger words (fcsks, fxhks, fhyks). Recommended stop_at_clip_layer is -2.
  • Compatibility: Tested with LEOSAM's HelloWorld XL 3.0, CyberRealistic XL v1.1, and RealVisXL v4.0. Incompatible with ArienMixXL v4.0.
  • Resources: FP16 precision and distilled Base LoRA reduce model size and GPU memory usage.
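The setup and usage rules above can be checked with a small helper before queuing a workflow. This is a minimal sketch, not part of the repository: the function names are hypothetical, joining the trigger words with single spaces is an assumption (the summary only says prompts must start with them), and the ComfyUI root path is whatever your installation uses.

```python
from pathlib import Path

# Trigger words and folder names come from the list above; everything else
# (function names, space separator, root path) is a hypothetical illustration.
TRIGGER_WORDS = ("fcsks", "fxhks", "fhyks")
MODEL_SUBDIRS = ("models/hyper_lora", "models/insightface")

def ensure_trigger_prefix(prompt: str) -> str:
    """Prepend the trigger words unless the prompt already starts with them."""
    prefix = " ".join(TRIGGER_WORDS)  # assumed separator: a single space
    body = prompt.strip()
    if body.startswith(prefix):
        return body
    return f"{prefix} {body}".strip()

def missing_model_dirs(comfyui_root: str) -> list[str]:
    """List expected model folders that do not exist under the ComfyUI root."""
    root = Path(comfyui_root)
    return [sub for sub in MODEL_SUBDIRS if not (root / sub).is_dir()]
```

For example, `ensure_trigger_prefix("a portrait of a woman")` returns the prompt with the three trigger words in front, and `missing_model_dirs` returns an empty list once both folders exist.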

Highlighted Details

  • Achieves zero-shot personalized portrait generation with high photorealism, fidelity, and editability.
  • Supports both single and multiple image inputs for personalization.
  • Offers two versions: _fidelity for better detail and _edit for enhanced editability.
  • Can serve as an initialization for further LoRA training (e.g., about 50 additional steps to obtain a better ID-LoRA).

Maintenance & Community

  • Official implementation from ByteDance.
  • Paper accepted to CVPR 2025.
  • Links to project page and arXiv paper are provided.

Licensing & Compatibility

  • Code is licensed under GPL 3.0, requiring derivative works to also be GPL 3.0.
  • Models are licensed under CC BY-NC 4.0, allowing non-commercial sharing and adaptation with credit.
  • Third-party model licenses (SDXL, CLIP, InsightFace) must also be adhered to.
  • GPL 3.0 license may restrict integration into closed-source commercial products.

Limitations & Caveats

The CC BY-NC 4.0 model license does not permit commercial use. Because the model was trained at a limited face resolution, the project recommends FaceDetailer for repairing small faces and ControlNet for improving stability.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 7
  • Star history: 220 stars in the last 90 days
