Image/video editing research paper using StyleGAN3
This repository provides the official implementation for "Third Time's the Charm? Image and Video Editing with StyleGAN3," focusing on analyzing and leveraging the StyleGAN3 architecture for image and video manipulation. It's designed for researchers and practitioners in generative AI and computer vision who want to explore advanced editing capabilities beyond StyleGAN2.
How It Works
The project analyzes StyleGAN3's latent spaces and finds the W/W+ spaces more entangled than StyleGAN2's, so it recommends StyleSpace for fine-grained editing. It introduces an encoder trained on aligned data that can nevertheless invert unaligned images, and a video editing workflow that reduces texture sticking and expands the field of view using a fine-tuned StyleGAN3 generator.
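To make the editing step concrete, the sketch below applies a learned semantic direction to an inverted W+ latent. The tensor names and shapes are illustrative assumptions, not the repository's documented API; the repo's own scripts wrap this kind of logic.

```python
# Illustrative latent-space edit (not the repo's exact API). Assumes you have:
#   w         - inverted W+ latent from the encoder, shape [1, num_ws, 512]
#   direction - learned semantic direction (e.g. age, smile), shape [512]
#   G         - a StyleGAN3 generator (see the loading sketch under Quick Start)
import torch

def edit_latent(w: torch.Tensor, direction: torch.Tensor, strength: float) -> torch.Tensor:
    """Shift every W+ layer along a semantic direction by `strength`."""
    return w + strength * direction.view(1, 1, -1)

# Example usage (hypothetical tensors):
# w_edited = edit_latent(w, age_direction, strength=3.0)
# img = G.synthesis(w_edited)   # NCHW image in roughly [-1, 1]
```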
Quick Start & Requirements
Set up the Python environment from the provided conda file at environment/sg3_env.yaml. Download the pretrained models and place them in the pretrained_models directory.
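As a quick smoke test, a StyleGAN3 generator pickle can be loaded and sampled roughly as follows. The checkpoint filename is a placeholder, and unpickling assumes the StyleGAN3 support modules (dnnlib, torch_utils) are importable from the repository root; this is a sketch under those assumptions, not the repo's documented entry point.

```python
# Load a StyleGAN3 generator pickle from pretrained_models/ and synthesize
# one image. The checkpoint name below is a placeholder.
import pickle
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

with open("pretrained_models/stylegan3-r-ffhq-1024x1024.pkl", "rb") as f:
    G = pickle.load(f)["G_ema"].to(device).eval()   # generator as a torch.nn.Module

z = torch.randn(1, G.z_dim, device=device)    # random latent code
w = G.mapping(z, None, truncation_psi=0.7)    # W+ codes, shape [1, num_ws, 512]
img = G.synthesis(w)                          # NCHW image in roughly [-1, 1]
print(img.shape)
```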
Highlighted Details
Maintenance & Community
The project is associated with its authors from Tel Aviv University and Adobe. Links to the relevant papers and codebases are provided. The last recorded activity was about two years ago, and the repository is currently marked inactive.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. It builds heavily on, and acknowledges, the official StyleGAN3 codebase, which NVIDIA distributes under its own NVIDIA Source Code License rather than a permissive license such as MIT. Suitability for commercial use therefore depends on the licenses of the underlying StyleGAN3 implementation and pretrained models.
Limitations & Caveats
The README notes that StyleGAN3's W/W+ latent spaces are more entangled than StyleGAN2's, which limits some editing approaches. CPU-only inference is not supported out of the box, though it may be possible with modifications. Training custom InterFaceGAN boundaries requires generating a large dataset of latent codes and corresponding attribute scores; a sketch of that recipe follows.
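For context, an InterFaceGAN-style boundary is typically obtained by sampling many latent codes, scoring the corresponding images with an attribute classifier, and fitting a linear SVM whose normal vector becomes the editing direction. The sketch below illustrates that general recipe; score_attribute, the sample count, and the median-split labeling are placeholder assumptions, not the repo's training script.

```python
# Sketch of training an InterFaceGAN-style boundary: sample latents, score the
# generated images for one attribute, then fit a linear SVM and take its normal.
import numpy as np
import torch
from sklearn.svm import LinearSVC

@torch.no_grad()
def collect_samples(G, score_attribute, n=10_000, device="cuda"):
    ws, scores = [], []
    for _ in range(n):
        z = torch.randn(1, G.z_dim, device=device)
        w = G.mapping(z, None, truncation_psi=0.7)   # [1, num_ws, 512]
        img = G.synthesis(w)                         # generated image, NCHW
        ws.append(w[0, 0].cpu().numpy())             # layers are identical here; keep one 512-d code
        scores.append(float(score_attribute(img)))   # scalar attribute score (placeholder classifier)
    return np.stack(ws), np.asarray(scores)

def fit_boundary(ws, scores):
    labels = (scores > np.median(scores)).astype(int)   # binarize scores at the median
    clf = LinearSVC().fit(ws, labels)
    boundary = clf.coef_[0]
    return boundary / np.linalg.norm(boundary)           # unit-norm editing direction
```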