GAN-based research code for facial video editing
STIT (Stitch it in Time) addresses the challenge of semantic facial editing in real videos using Generative Adversarial Networks (GANs). It targets researchers and practitioners in computer vision and graphics who need to perform high-quality, temporally coherent facial manipulations on videos, offering significant improvements over existing methods for talking-head videos.
How It Works
STIT leverages the inherent temporal consistency of source videos and the strong prior learned by StyleGAN's latent space. By carefully managing each stage of the editing pipeline, it minimizes deviations from the video's natural temporal flow. The framework builds on StyleGAN2-ADA and incorporates a "stitching tuning" process to enhance temporal coherence, effectively "stitching" edits across frames.
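The following is a minimal illustrative sketch of that idea, not the repository's actual API: each frame is inverted into the generator's latent space, the same edit direction is applied to every frame, and the edited face crop is blended back into the original frame. All function names below (invert_frame, generate, stitch, edit_video) are hypothetical stand-ins for the real inversion, generation, and blending code.

```python
# Illustrative sketch of per-frame latent editing with a shared edit direction.
import numpy as np

LATENT_DIM = 512  # StyleGAN2 W-space dimensionality

def invert_frame(frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for PTI-style inversion of one aligned face crop."""
    return np.zeros(LATENT_DIM, dtype=np.float32)

def generate(latent: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the (fine-tuned) StyleGAN2-ADA generator."""
    return np.zeros((1024, 1024, 3), dtype=np.float32)

def stitch(edited_crop: np.ndarray, original_frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the stitching/blending step; the real code
    blends the edited crop back along a boundary region."""
    return original_frame

def edit_video(frames: list[np.ndarray],
               edit_direction: np.ndarray,
               strength: float) -> list[np.ndarray]:
    edited = []
    for frame in frames:
        w = invert_frame(frame)                 # per-frame inversion
        w_edit = w + strength * edit_direction  # identical edit for every frame
        edited.append(stitch(generate(w_edit), frame))
    return edited
```

Applying one shared edit direction to all frames, rather than optimizing each frame independently, is what lets the source video's own temporal consistency carry through to the result.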
Quick Start & Requirements
- Install dependencies: pip install -r requirements.txt
- For StyleCLIP edits, also run: pip install git+https://github.com/openai/CLIP.git
- Pretrained model paths are configured in configs/path_config.py
- Videos need to be split into individual frames before editing (e.g., using ffmpeg; see the sketch below)
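A minimal sketch of the frame-splitting step, assuming ffmpeg is installed and on PATH; the input filename and output pattern are illustrative, and the exact directory layout expected by the scripts may differ.

```python
# Split a video into numbered PNG frames with ffmpeg (illustrative paths).
import pathlib
import subprocess

video = "input.mp4"              # hypothetical input video
out_dir = pathlib.Path("frames") # hypothetical output directory
out_dir.mkdir(exist_ok=True)

# Writes one image per frame: frames/0001.png, frames/0002.png, ...
subprocess.run(
    ["ffmpeg", "-i", video, str(out_dir / "%04d.png")],
    check=True,
)
```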
Highlighted Details
Maintenance & Community
The project is associated with the authors' research and academic work. Links to relevant research papers and underlying project licenses are provided.
Licensing & Compatibility
The project incorporates components with various licenses: NVIDIA Source Code License (StyleGAN2-ADA), MIT (PTI, e4e, StyleCLIP, face-parsing.PyTorch), BSD 2-Clause (LPIPS), and Creative Commons NonCommercial (stylegan2-distillation). The non-commercial clause from stylegan2-distillation may restrict commercial use.
Limitations & Caveats
The project relies on specific versions of PyTorch and CUDA. Some components have non-commercial licenses, potentially limiting broader adoption. Good results on out-of-domain videos require specific parameter tuning.
Last updated 3 years ago; the repository is currently inactive.