adaptive-style-transfer by CompVis

Style transfer research paper implementation (ECCV 2018)

Created 7 years ago

743 stars

Top 46.8% on SourcePulse

1 Expert Loves This Project

Anastasis Germanidis

Cofounder of Runway

Project Summary

This repository provides the source code for the ECCV 2018 paper "A Style-Aware Content Loss for Real-time HD Style Transfer". It performs high-definition artistic style transfer on images and videos in real time and is aimed at researchers and developers in computer vision and graphics.

How It Works

The project implements a style-aware content loss mechanism for neural style transfer. This approach enhances the quality and realism of stylized outputs by considering perceptual features beyond simple texture matching, allowing for more faithful preservation of content structure while adapting to artistic styles.
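
One way to read the description above is as a content loss measured on encoder features rather than on raw pixels. The following is a minimal NumPy sketch of that idea, not the repository's TensorFlow implementation: `toy_encoder` is a hypothetical stand-in (plain average pooling) for a learned encoder, and `feature_space_content_loss` is an illustrative name.

```python
import numpy as np


def toy_encoder(image, pool=4):
    """Hypothetical stand-in for a learned encoder: average-pool each channel.

    `image` is an (H, W, C) array; rows and columns that do not fill a whole
    pooling block are cropped away.
    """
    h, w, c = image.shape
    h_crop, w_crop = h - h % pool, w - w % pool
    blocks = image[:h_crop, :w_crop].reshape(
        h_crop // pool, pool, w_crop // pool, pool, c
    )
    # Average over the two within-block axes.
    return blocks.mean(axis=(1, 3))


def feature_space_content_loss(content, stylized, encoder=toy_encoder):
    """Mean squared distance between encoder features of the two images."""
    diff = encoder(content) - encoder(stylized)
    return float(np.mean(diff ** 2))
```

With this toy encoder, adding a fine checkerboard pattern to an image changes every pixel but leaves the pooled features, and therefore the loss, essentially unchanged. That is the sense in which a feature-space loss can tolerate changes in texture while penalizing changes in coarse structure.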

Quick Start & Requirements

Install: Python 2.7 or 3.6+ with TensorFlow 1.2 or 1.12.0, plus PIL, NumPy, SciPy, and tqdm.
Inference: Download the pretrained models (e.g., van Gogh) and sample photographs, then run: CUDA_VISIBLE_DEVICES=0 python main.py --model_name=model_van-gogh --phase=inference --image_size=1280
Training: Requires the Places365-Standard dataset (105GB) for content images and the style image archive for the chosen artist. Launch training with: CUDA_VISIBLE_DEVICES=1 python main.py --model_name=model_van-gogh_new --batch_size=1 --phase=train --image_size=768 --lr=0.0002 --dsr=0.8 --ptcd=/path/to/Places2/data_large --ptad=./data/vincent-van-gogh_road-with-cypresses-1890
Resources: Training requires at least 105GB of disk space for the dataset alone. Inference can run on the CPU if GPU memory is insufficient.
Links: Website, Paper
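
The inference command above can also be assembled and launched from Python. This is a minimal sketch that uses only the flags quoted above; the helper names (`build_inference_command`, `build_env`, `run_inference`) are hypothetical and not part of the repository, and `repo_dir` is assumed to be a local checkout containing `main.py`.

```python
import os
import subprocess


def build_inference_command(model_name, image_size=1280):
    """Return the argument list for the inference run quoted in Quick Start."""
    return [
        "python",
        "main.py",
        f"--model_name={model_name}",
        "--phase=inference",
        f"--image_size={image_size}",
    ]


def build_env(gpu_id="0", base_env=None):
    """Copy the environment and select which GPU the process may see."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_inference(model_name, image_size=1280, gpu_id="0", repo_dir="."):
    """Launch main.py from the (assumed) repository checkout in repo_dir."""
    subprocess.run(
        build_inference_command(model_name, image_size),
        env=build_env(gpu_id),
        cwd=repo_dir,
        check=True,
    )
```

For example, `run_inference("model_van-gogh")` reproduces the command from the Inference line. Setting `gpu_id=""` leaves `CUDA_VISIBLE_DEVICES` empty, which is a common way to hide all GPUs from a CUDA application so that it falls back to the CPU. Whether this project runs acceptably that way at a given image size is not stated here.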

Highlighted Details

Supports real-time HD style transfer.
Offers pretrained models for multiple artists (Cezanne, Kandinsky, Monet, Picasso, van Gogh, etc.).
Includes scripts for video stylization using FFmpeg for frame splitting and reassembly.
Provides evaluation metrics and an artist classification model.
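
The video workflow mentioned above (split into frames, stylize, reassemble) can be sketched as two FFmpeg invocations around the stylization step. This is an illustrative outline, not the repository's own scripts: the function names, the `%06d.png` frame naming, the 30 fps default, and the H.264 output settings are all assumptions, and the stylized frames are assumed to keep the same numbering as the extracted ones.

```python
import subprocess


def split_command(video_path, frames_dir):
    """FFmpeg arguments that write every frame of the video as a numbered PNG."""
    return ["ffmpeg", "-i", video_path, f"{frames_dir}/%06d.png"]


def reassemble_command(frames_dir, output_path, fps=30):
    """FFmpeg arguments that encode numbered PNG frames into an H.264 video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),
        "-i", f"{frames_dir}/%06d.png",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        output_path,
    ]


def run(command):
    """Execute one of the commands above; requires ffmpeg on the PATH."""
    subprocess.run(command, check=True)
```

A typical sequence would be `run(split_command("input.mp4", "frames"))`, then stylizing the images in `frames` with the inference phase, then `run(reassemble_command("stylized", "output.mp4"))`. The frame rate passed to `reassemble_command` should match the source video, otherwise the output plays at the wrong speed.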

Maintenance & Community

The project originates from CompVis, a research group known for significant contributions to computer vision. No specific community channels (like Discord/Slack) or active maintenance signals are evident in the README.

Licensing & Compatibility

License: GNU General Public License v3.0 or later.
Compatibility: The GPL is a copyleft license. Distributed works that incorporate this code must themselves be released under the GPL, which generally rules out use inside closed-source commercial applications.

Limitations & Caveats

The project depends on TensorFlow 1.x (1.2 or 1.12.0) and was set up for Python 2.7 or 3.6+, so it may be difficult to run in modern development environments. Training also requires the 105GB Places365-Standard dataset, which is a significant barrier.

Health Check

Last Commit

5 years ago

Responsiveness

1 day

Star History

0 stars in the last 30 days