ALAE by podgorskiy

Adversarial latent autoencoder for combining generative/representational properties

created 6 years ago
3,524 stars

Top 14.0% on sourcepulse

Project Summary

Adversarial Latent Autoencoders (ALAE) is an open-source project that provides an autoencoder architecture combining the generative capabilities of GANs with the representation learning of autoencoders. It targets two open questions for autoencoders: whether they can match the generative power of GANs and whether they learn disentangled representations. The intended audience is researchers and practitioners in computer vision and generative modeling. The main benefit is high-resolution image generation (e.g., 1024x1024 faces) at quality comparable to StyleGAN, together with reconstruction and manipulation of real images.

How It Works

ALAE is an autoencoder architecture trained with recent GAN training procedures. It comes in two variants: one built around an MLP encoder, and one built around a StyleGAN-based generator, called StyleALAE. StyleALAE allows fine-grained control over latent space manipulations, similar to StyleGAN, but within an autoencoder framework, so real images can be encoded as well as generated. The result combines high-fidelity synthesis with a latent space that is useful for representation learning.
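The paragraph above does not spell out the training objective. As a rough illustration, the ALAE paper describes splitting the generator into a mapping network F and a generator G, and enforcing autoencoder reconstruction in the latent space rather than in pixel space. The sketch below shows only that data flow, with random linear maps standing in for trained networks. The helper names (`linear`, `latent_reciprocity_loss`) and the toy dimensions are assumptions made for illustration and are not taken from the repository; the adversarial part of the objective is omitted.

```python
import random


def linear(in_dim, out_dim, rng):
    """Build a random linear map as a closure; a stand-in for a trained network."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(vector):
        return [sum(w * v for w, v in zip(row, vector)) for row in weights]

    return apply


def latent_reciprocity_loss(z, mapping, generator, encoder):
    """Mean squared error between w = F(z) and E(G(w)), measured in latent space."""
    w = mapping(z)
    w_reconstructed = encoder(generator(w))
    return sum((a - b) ** 2 for a, b in zip(w, w_reconstructed)) / len(w)


rng = random.Random(0)
Z_DIM, W_DIM, X_DIM = 4, 4, 16          # toy sizes, chosen only for illustration

mapping = linear(Z_DIM, W_DIM, rng)     # F: noise z -> latent code w
generator = linear(W_DIM, X_DIM, rng)   # G: latent code w -> "image" x
encoder = linear(X_DIM, W_DIM, rng)     # E: "image" x -> latent code w

z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]
loss = latent_reciprocity_loss(z, mapping, generator, encoder)
print(f"latent reconstruction error: {loss:.4f}")
```

With untrained random maps the error is arbitrary; training would drive it toward zero so that E inverts G on the latent codes the model actually uses.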

Quick Start & Requirements

  • Install: pip install -r requirements.txt and pip install dareblopy
  • Prerequisites: PyTorch >= 1.3.1, a CUDA-capable GPU, and CUDA/cuDNN installed. For metrics: TensorFlow 1.10 and CUDA 9.0.
  • Pre-trained Models: Download via python training_artifacts/download_all.py.
  • Demo: python interactive_demo.py (specify config with -c).
  • Docs: https://github.com/podgorskiy/ALAE
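
Before running the demo, it can help to confirm that the local setup meets the prerequisites listed above. The following is a minimal sketch, not part of the repository: `parse_version` and `check_environment` are hypothetical helper names, and the checks cover only the PyTorch version, GPU visibility, and cuDNN availability.

```python
import re

MIN_TORCH = (1, 3, 1)  # minimum PyTorch version from the prerequisites above


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def check_environment():
    """Collect problems with the local setup; an empty list means none were found."""
    try:
        import torch
    except ImportError:
        return ["PyTorch is not installed"]

    problems = []
    if parse_version(torch.__version__) < MIN_TORCH:
        problems.append(f"PyTorch {torch.__version__} is older than 1.3.1")
    if not torch.cuda.is_available():
        problems.append("no CUDA-capable GPU is visible to PyTorch")
    if not torch.backends.cudnn.is_available():
        problems.append("cuDNN is not available")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

The check does not cover the separate TensorFlow 1.10 and CUDA 9.0 setup needed for metrics, which would typically live in its own environment.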

Highlighted Details

  • Achieves 1024x1024 face generation comparable to StyleGAN.
  • Enables face reconstruction and manipulation from real images.
  • Supports various datasets including FFHQ, CelebA, CelebA-HQ, and LSUN Bedroom.
  • Includes scripts for style mixing, reconstructions, traversals, and generation figures.

Maintenance & Community

The project is the official repository for the CVPR 2020 paper "Adversarial Latent Autoencoders." No specific community channels (like Discord/Slack) or active maintenance signals are explicitly mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. However, it references the StyleGAN repository for pre-trained models and metrics code, which are derived from networks originally shared under Apache 2.0 and Creative Commons BY 4.0 licenses. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Running metrics requires a specific, older setup (TensorFlow 1.10 with CUDA 9.0), which can be difficult to install and maintain. Reproducing results on fewer than 8 GPUs may be problematic. The README does not mention support for newer PyTorch versions or alternative ways to install the metric dependencies.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 4 stars in the last 90 days
