SemanticStyleGAN by seasonSH

Image synthesis research paper (CVPR 2022)

created 3 years ago
272 stars

Project Summary

SemanticStyleGAN is the official code for a CVPR 2022 paper. It enables compositional image synthesis and fine-grained editing by modeling local semantic parts separately. It targets researchers and developers working with GANs who need more control over image generation and manipulation than standard StyleGANs offer. Its main benefit is stronger disentanglement between spatial regions, which allows more precise control over each one.

How It Works

SemanticStyleGAN trains a generator that synthesizes an image by composing local semantic parts, each controlled by its own latent code. This compositional approach, detailed in the CVPR 2022 paper, allows the structure and texture of each image region to be controlled separately. Compared with the global latent code used in standard StyleGANs, it yields stronger disentanglement between spatial areas.
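To make the idea of composing parts concrete, here is a minimal toy sketch in plain Python. It is not the repository's code. It assumes that each part produces a per-pixel feature and a per-pixel score, and that the scores are turned into soft masks with a softmax before the features are blended; that fusion rule is an assumption made for illustration. The names toy_local_generator and compose are hypothetical, and an integer stands in for each latent code.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def toy_local_generator(latent, num_pixels):
    """Hypothetical stand-in for one part's generator.

    Returns per-pixel features and per-pixel scores that depend only on
    this part's own latent code (here simplified to an integer seed).
    """
    rng = random.Random(latent)
    features = [rng.uniform(-1.0, 1.0) for _ in range(num_pixels)]
    scores = [rng.uniform(-3.0, 3.0) for _ in range(num_pixels)]
    return features, scores


def compose(latents, num_pixels=8):
    """Blend the parts pixel by pixel using softmax masks (assumed rule)."""
    outputs = [toy_local_generator(z, num_pixels) for z in latents]
    fused, masks = [], []
    for p in range(num_pixels):
        # One weight per part at this pixel; the weights sum to 1.
        weights = softmax([scores[p] for _, scores in outputs])
        masks.append(weights)
        fused.append(sum(w * feats[p] for w, (feats, _) in zip(weights, outputs)))
    return fused, masks


if __name__ == "__main__":
    fused, masks = compose([1, 2, 3])  # three parts, three latent codes
    print(fused[:3])
    print(masks[0])
```

In this sketch, changing one entry of latents changes only that part's features and scores, although the softmax renormalizes the masks of every part at each pixel.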

Quick Start & Requirements

  • Install dependencies with pip install -r requirements.txt.
  • Requires Python 3 and PyTorch 1.8+.
  • Pretrained models are available for CelebAMask-HQ, BitMoji, MetFaces, and Toonify.
  • Official documentation and inference scripts for synthesis, inversion, and metrics are provided.
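Before installing, it can help to confirm that the environment meets the stated requirements. The following is a small sketch, not part of the repository: the helper names parse_version and meets_minimum are hypothetical, and the check only compares the leading numeric part of a version string against 1.8.

```python
import re
import sys


def parse_version(text):
    """Extract leading numeric components, e.g. '1.13.1+cu117' -> (1, 13, 1)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum=(1, 8)):
    """Return True if the version is at least the given minimum."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    print("Python 3:", sys.version_info.major == 3)
    try:
        import torch  # only imported if it is installed
        print("PyTorch 1.8+:", meets_minimum(torch.__version__))
    except ImportError:
        print("PyTorch is not installed")
```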

Highlighted Details

  • Achieves fine-grained control over synthesized and real images when combined with existing StyleGAN editing methods.
  • Supports domain adaptation for new datasets via fine-tuning.
  • Offers optimization-based inversion for real images into the model's latent space.
  • Includes scripts for visualizing random synthesis, local latent interpolation, and component synthesis.
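Local latent interpolation, mentioned in the list above, can be pictured as moving one part's latent code from a source to a target while every other part stays fixed. The sketch below is illustrative only and does not use the repository's scripts. The dictionary layout, the part names "hair" and "eyes", and the function interpolate_part are assumptions.

```python
def lerp(a, b, t):
    """Linear interpolation between two equal-length lists of floats."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_part(source, target, part, steps=5):
    """Interpolate a single part's latent code; keep all other parts fixed.

    source and target map a part name to its latent code (a list of floats).
    Returns a list of `steps` latent dictionaries, from source to target
    for the chosen part only.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    frames = []
    for i in range(steps):
        t = i / (steps - 1)
        frame = {name: list(code) for name, code in source.items()}
        frame[part] = lerp(source[part], target[part], t)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    source = {"hair": [0.0, 0.0], "eyes": [1.0, 1.0]}
    target = {"hair": [2.0, 4.0], "eyes": [5.0, 5.0]}
    for frame in interpolate_part(source, target, "hair", steps=3):
        print(frame)
```

Each returned dictionary would then be passed to a generator to render one frame; only the chosen part's code differs between frames.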

Maintenance & Community

The project is the official implementation for a CVPR 2022 paper. No specific community channels or active maintenance signals are mentioned in the README.

Licensing & Compatibility

The core StyleGAN2 implementation is MIT licensed. However, CUDA files are provided under the Nvidia Source Code License-NC, which may restrict commercial use or linking with closed-source applications.

Limitations & Caveats

The non-commercial license on the CUDA files may limit commercial applications. The README does not state hardware requirements beyond PyTorch compatibility, and it does not mention performance bottlenecks or known bugs.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
