ALAE by podgorskiy

Adversarial latent autoencoder for combining generative/representational properties

Created 6 years ago

3,529 stars

Top 13.7% on SourcePulse

6 Experts Love This Project

Jong Wook Kim: Research Scientist at OpenAI

Deepak Pathak: Cofounder of Skild AI; Professor at CMU

Siyuan Zhuang: Coauthor of vLLM

Binyuan Hui: Research Scientist at Alibaba Qwen

and 2 more!

Project Summary

Adversarial Latent Autoencoders (ALAE) is an open-source project providing an architecture that combines the generative capabilities of GANs with the representational power of autoencoders. It targets two weaknesses of autoencoders relative to GANs: lower generative quality and less disentangled representations. The intended audience is researchers and practitioners in computer vision and generative modeling. The primary benefit is high-resolution image generation (e.g., 1024x1024 faces) with quality comparable to GANs, together with reconstruction and manipulation of real images.

How It Works

ALAE is an autoencoder architecture that builds on recent advances in GAN training procedures. It comes in two variants: one built around a standard MLP encoder and, more notably, one built around a StyleGAN-based generator (StyleALAE). StyleALAE allows fine-grained control over latent-space manipulations, similar to StyleGAN, but within an autoencoder framework. As a result, a single model offers both high-fidelity synthesis and an encoder that maps real images into a meaningful latent space.
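The data flow behind this can be sketched with toy functions. The decomposition below follows the ALAE paper (a mapping F from noise to latent code, a generator G, an encoder E, and a discriminator head D on the latent code), not the repository's code. The linear maps are placeholder stand-ins chosen so that E exactly inverts G; real ALAE networks are learned and only approximately reciprocal.

```python
# Toy stand-ins for the four ALAE components (assumption: plain functions on
# lists of floats; the real networks are deep neural networks).

def F(z):
    """Mapping network: noise z -> latent code w."""
    return [2.0 * v + 1.0 for v in z]

def G(w):
    """Generator: latent code w -> "image" x."""
    return [0.5 * v for v in w]

def E(x):
    """Encoder: "image" x -> latent code w (here the exact inverse of G)."""
    return [2.0 * v for v in x]

def D(w):
    """Discriminator head: latent code w -> scalar score."""
    return sum(w)

def generate(z):
    """GAN-style synthesis: x = G(F(z))."""
    return G(F(z))

def reconstruct(x):
    """Autoencoder-style reconstruction: x_hat = G(E(x))."""
    return G(E(x))

def critic(x):
    """The adversarial critic scores an image through the encoder: D(E(x))."""
    return D(E(x))

def latent_reciprocity_error(z):
    """Mean squared error between w = F(z) and E(G(w)), measured in latent space."""
    w = F(z)
    w_rec = E(G(w))
    return sum((a - b) ** 2 for a, b in zip(w, w_rec)) / len(w)
```

In this toy setup `latent_reciprocity_error` is exactly zero because E inverts G by construction. In training, that latent-space error is a quantity the model is optimized to reduce, alongside the adversarial objective.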

Quick Start & Requirements

Install: pip install -r requirements.txt and pip install dareblopy
Prerequisites: PyTorch >= v1.3.1 and a CUDA-capable GPU with CUDA/cuDNN drivers. For metrics only: TensorFlow 1.10 and CUDA 9.0.
Pre-trained Models: Download via python training_artifacts/download_all.py.
Demo: python interactive_demo.py (specify config with -c).
Docs: https://github.com/podgorskiy/ALAE
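The steps above can be chained in a small helper script. This is a minimal sketch, assuming it is run from the root of a cloned ALAE checkout. The function names `quickstart_commands` and `run_all` are hypothetical, and the config name passed to `-c` is left to the caller because the available names are not listed here.

```python
import shlex
import subprocess
import sys

def quickstart_commands(config=None, python=sys.executable):
    """Return the quick-start commands, in order, as argument lists."""
    commands = [
        [python, "-m", "pip", "install", "-r", "requirements.txt"],
        [python, "-m", "pip", "install", "dareblopy"],
        # Downloads the pre-trained models.
        [python, "training_artifacts/download_all.py"],
    ]
    demo = [python, "interactive_demo.py"]
    if config is not None:
        demo += ["-c", config]  # config name is supplied by the caller
    commands.append(demo)
    return commands

def run_all(commands):
    """Run each command in sequence; stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Print the commands only; call run_all(...) to actually execute them.
    for cmd in quickstart_commands():
        print(shlex.join(cmd))
```

Using `python -m pip` with the current interpreter keeps the packages in the same environment that later runs the demo.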

Highlighted Details

Achieves 1024x1024 face generation comparable to StyleGAN.
Enables face reconstruction and manipulation from real images.
Supports various datasets including FFHQ, CelebA, CelebA-HQ, and LSUN Bedroom.
Includes scripts for style mixing, reconstructions, traversals, and generation figures.

Maintenance & Community

The project is the official repository for the CVPR 2020 paper "Adversarial Latent Autoencoders." No specific community channels (like Discord/Slack) or active maintenance signals are explicitly mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. However, it references the StyleGAN repository for pre-trained models and metrics code, which are derived from networks originally shared under Apache 2.0 and Creative Commons BY 4.0 licenses. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Running metrics requires an old, pinned stack (TensorFlow 1.10 with CUDA 9.0), which can be hard to install alongside current drivers and a recent PyTorch. Training results may be hard to reproduce on fewer than 8 GPUs. The README does not mention support for newer PyTorch versions or alternative ways to install the metric dependencies.
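Since the two version constraints differ in kind (a minimum for PyTorch, a pin for TensorFlow), a quick check before running anything can save a failed setup. The sketch below is an illustration, not part of the project; the helper names `parse_version`, `at_least`, and `same_minor` are hypothetical. It compares versions numerically because a plain string comparison would rank "1.13" below "1.3".

```python
def parse_version(text):
    """Turn a version string such as '1.13.0+cu117' into a tuple like (1, 13, 0).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def at_least(installed, minimum):
    """True if the installed version is greater than or equal to the minimum."""
    return parse_version(installed) >= parse_version(minimum)

def same_minor(installed, pinned):
    """True if major and minor components match, e.g. '1.10.1' vs '1.10'."""
    return parse_version(installed)[:2] == parse_version(pinned)[:2]

# Example usage, in an environment where the packages are installed:
#   import torch
#   at_least(torch.__version__, "1.3.1")
#   import tensorflow as tf
#   same_minor(tf.__version__, "1.10")
```

Keeping the metrics stack in a separate environment from the main PyTorch one is one way to satisfy both constraints at once.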

Health Check

Last Commit: 5 years ago

Responsiveness: 1+ week

Star History

5 stars in the last 30 days