Word-As-Image by Shiriluz

Research paper implementation for semantic typography

Created 2 years ago

1,142 stars

Top 33.6% on SourcePulse

2 Experts Love This Project

Yoland Yan

Cofounder of Comfy Org

Travis Fischer

Founder of Agentic

Project Summary

This repository provides the official implementation for "Word-As-Image for Semantic Typography," a SIGGRAPH 2023 Honorable Mention award-winning technique. It automatically generates stylized typography where letterforms visually represent the word's meaning while maintaining readability, targeting designers and researchers interested in creative text generation.

How It Works

The method leverages large pretrained language-vision models, specifically Stable Diffusion, to distill textual concepts into visual representations. It optimizes the outline of individual letters to convey semantic meaning, guided by the diffusion model. Additional loss functions ensure legibility and preserve the original font's style, resulting in simple, black-and-white designs.
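The way the guidance term and the additional losses combine can be pictured as a weighted sum. The sketch below is illustrative only: the class name, function name, and default weights are hypothetical and are not taken from the paper or the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    # Hypothetical weights chosen for illustration, not values from the project.
    semantic: float = 1.0    # diffusion-guided term that pushes toward the concept
    legibility: float = 0.5  # term that keeps the letter readable
    style: float = 0.5       # term that preserves the original font's style


def total_loss(semantic, legibility, style, weights=LossWeights()):
    """Combine the three scalar loss terms into one objective (weighted sum)."""
    return (weights.semantic * semantic
            + weights.legibility * legibility
            + weights.style * style)


# Raising a weight makes the optimizer favor that property over the others.
baseline = total_loss(2.0, 1.0, 1.0)
readable = total_loss(2.0, 1.0, 1.0, LossWeights(legibility=2.0))
```

In a real pipeline each term would be a differentiable tensor computed from the rendered letter outline; plain floats are used here to keep the sketch self-contained.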

Quick Start & Requirements

Installation: Clone the repository, create a conda environment (conda create --name word python=3.8.15), activate it, and install dependencies using pip and conda as specified in the README. This includes PyTorch with CUDA 11.3, diffusers, transformers, diffvg, and various image processing and SVG libraries.
Prerequisites: CUDA 11.3, Python 3.8.15, HuggingFace access token for Stable Diffusion.
Setup: Requires cloning multiple repositories (main repo, diffvg) and installing numerous Python packages. Estimated setup time is moderate.
Usage: Run experiments via run_word_as_image.sh or python code/main.py with arguments like --semantic_concept, --optimized_letter, and --font.
Links: Official Implementation, SIGGRAPH Paper
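
The usage line above can be sketched as a small Python wrapper that assembles the command for code/main.py. The flag names come from the summary; the example values ("BUNNY", "Y", "KaushanScript-Regular") are placeholders, and the helper build_command is hypothetical rather than part of the repository.

```python
import shlex
import sys


def build_command(semantic_concept, optimized_letter, font, script="code/main.py"):
    """Return the argument list for one run of the main script."""
    return [
        sys.executable, script,
        "--semantic_concept", semantic_concept,
        "--optimized_letter", optimized_letter,
        "--font", font,
    ]


# Placeholder values; substitute a word, one of its letters, and an available font.
cmd = build_command("BUNNY", "Y", "KaushanScript-Regular")
print(shlex.join(cmd))

# To launch the run from inside a cloned and configured repository:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when a concept contains spaces. The script may require further arguments (for example the HuggingFace token) that are not listed in this summary.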

Highlighted Details

SIGGRAPH 2023 Honorable Mention Award winner.
Integrates Stable Diffusion with Diffvg for vector graphic optimization.
Focuses on semantic understanding and creative visualization within letterforms.
Generates minimal, flat, 2D vector designs.

Maintenance & Community

The project is tied to its academic authors. The README lists no community channels (Discord/Slack) and shows no signs of active maintenance.

Licensing & Compatibility

Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This license prohibits commercial use and requires derivative works to be shared under the same or a compatible license.

Limitations & Caveats

The CC BY-NC-SA 4.0 license prohibits commercial use. The implementation depends on specific versions of PyTorch and CUDA and requires a HuggingFace token, which may pose adoption barriers. Tuning parameters such as the L_acap loss weight and the low-pass filter sigma is suggested for adjusting output quality.
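
To give a feel for what the low-pass filter sigma controls, here is a minimal one-dimensional sketch. It assumes the filter is a Gaussian blur, which this summary does not state, and the function names are hypothetical rather than taken from the repository.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Normalized 1D Gaussian weights; the radius defaults to about 3 sigma."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def low_pass(signal, sigma):
    """Blur a 1D signal with a Gaussian kernel, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * signal[j]
        out.append(acc)
    return out


# A hard black-to-white edge, standing in for one row of a rasterized letter.
edge = [0.0] * 8 + [1.0] * 8
sharp = low_pass(edge, sigma=0.5)  # small sigma: the edge stays crisp
soft = low_pass(edge, sigma=2.0)   # large sigma: the edge spreads out
```

Under this assumption, a larger sigma compares shapes at a coarser scale, so fine outline changes are tolerated while the overall mass of the letter is still constrained.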

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days