PosterCraft by Ephemeral182

Unified framework for aesthetic poster generation

Created 3 months ago
494 stars

Top 62.6% on SourcePulse

Project Summary

PosterCraft is a unified framework for generating high-quality aesthetic posters. It targets users who need precise text rendering, seamless integration of artistic elements, and striking visual layouts, with text and imagery kept stylistically consistent.

How It Works

PosterCraft is trained with a four-stage workflow. Stage 1, Text Rendering Optimization, teaches accurate text placement on backgrounds. Stage 2, High-quality Poster Fine-tuning, uses Region-aware Calibration to harmonize style and the relationship between text and background. Stage 3, Aesthetic-Text RL, applies reinforcement learning to handle higher-order aesthetic trade-offs and reduce defects. Stage 4, Vision-Language Feedback, adds iterative refinement and multi-modal corrections. Each stage builds on the previous one, so later stages improve aesthetics while aiming to keep the text fidelity gained earlier.
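
The four stages can be written down as a small lookup table. This is only an illustrative sketch: the Stage class and its field names are hypothetical, and the pairing of datasets with Stages 1 and 2 is inferred from the dataset names listed in this summary rather than confirmed by the project. The checkpoint names for Stages 3 and 4 are the two released weights mentioned under Highlighted Details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical structure, for illustration only)."""
    order: int
    name: str
    dataset: str               # pairing inferred from this summary
    checkpoint: Optional[str]  # released weights for this stage, if any


STAGES = [
    Stage(1, "Text Rendering Optimization", "Text-Render-2M", None),
    Stage(2, "High-quality Poster Fine-tuning", "HQ-Poster-100K", None),
    Stage(3, "Aesthetic-Text RL", "Poster-Preference-100K", "PosterCraft-v1_RL"),
    Stage(4, "Vision-Language Feedback", "Poster-Reflect-120K", "PosterCraft-v1_Reflect"),
]

# List only the stages that have a released checkpoint.
released = [s for s in STAGES if s.checkpoint is not None]
for stage in released:
    print(f"Stage {stage.order}: {stage.name} -> {stage.checkpoint}")
```

Read this way, the two downloadable checkpoints correspond to the last two stages of the workflow.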

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create -n postercraft python=3.11), activate it (conda activate postercraft), and install the dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.11 and a CUDA-capable GPU (CUDA is not listed explicitly but is implied for GPU inference).
  • Inference: Run python inference.py, or python inference_offload.py on memory-limited GPUs. Inference requires a prompt, a pipeline path (e.g., "black-forest-labs/FLUX.1-dev"), and a custom transformer path (e.g., "PosterCraft/PosterCraft-v1_RL").
  • Demo: A Gradio web UI is available via python demo_gradio.py.
  • Resources: Model weights and datasets are available on HuggingFace.
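
The inference step above can be sketched as a small helper that assembles the command line. The script names and example paths come from this summary, but the flag names (--prompt, --pipeline_path, --custom_transformer_path) are assumptions made for illustration; check the argument parser in inference.py for the actual names before running anything.

```python
import shlex
import sys


def build_inference_command(
    prompt: str,
    pipeline_path: str = "black-forest-labs/FLUX.1-dev",
    transformer_path: str = "PosterCraft/PosterCraft-v1_RL",
    offload: bool = False,
) -> list:
    """Return an argv list for the inference script.

    The flag names below are ASSUMED, not taken from the repository.
    """
    # Use the offload variant on memory-limited GPUs.
    script = "inference_offload.py" if offload else "inference.py"
    return [
        sys.executable, script,
        "--prompt", prompt,                            # assumed flag name
        "--pipeline_path", pipeline_path,              # assumed flag name
        "--custom_transformer_path", transformer_path, # assumed flag name
    ]


cmd = build_inference_command("Retro jazz festival poster", offload=True)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when the prompt contains spaces or punctuation; the list can be passed directly to subprocess.run from the repository root.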

Highlighted Details

  • Reports state-of-the-art text rendering accuracy, outperforming several open-source and closed-source models on quantitative benchmarks (Text Recall, F-score, Accuracy).
  • Utilizes specialized datasets: Text-Render-2M (2M text rendering examples), HQ-Poster-100K (100K curated posters), Poster-Preference-100K (100K preference pairs for RL), and Poster-Reflect-120K (120K vision-language feedback pairs).
  • Offers two fine-tuned model weights: PosterCraft-v1_RL (Stage 3) and PosterCraft-v1_Reflect (Stage 4).
  • Integrates with ComfyUI via community contributions.
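
To make the text rendering metrics above more concrete, here is a minimal sketch of word-level recall and F-score between the text requested in a prompt and the text read back from a generated poster. The exact definitions used in the project's benchmark are not given in this summary, so this is a common formulation chosen as an assumption; the function name text_metrics is hypothetical, and the OCR step that would produce the recognized text is omitted.

```python
from collections import Counter


def text_metrics(target: str, recognized: str) -> dict:
    """Word-level recall, precision, and F-score (assumed definitions)."""
    target_words = Counter(target.lower().split())
    recognized_words = Counter(recognized.lower().split())
    # Multiset intersection: each word counts at most as often as it
    # appears in both the target and the recognized text.
    matched = sum((target_words & recognized_words).values())
    total_target = sum(target_words.values())
    total_recognized = sum(recognized_words.values())
    recall = matched / total_target if total_target else 0.0
    precision = matched / total_recognized if total_recognized else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f_score": f_score}


# One of four words is misspelled in the hypothetical OCR output.
scores = text_metrics("Summer Jazz Festival 2025", "Summer Jaz Festival 2025")
print(scores)
```

In this example three of the four target words are recovered, so recall, precision, and F-score all come out to 0.75.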

Maintenance & Community

The project is associated with The Hong Kong University of Science and Technology (Guangzhou) and Meituan. Updates include community integrations and Chinese article releases. Contact information for authors is provided.

Licensing & Compatibility

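The README does not explicitly state a license. Model weights are hosted on HuggingFace, so their terms of use are whatever each model card specifies; hosting on HuggingFace does not by itself grant a license. The example base pipeline, FLUX.1-dev, is distributed under its own non-commercial license, so check the applicable terms before any commercial use.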

Limitations & Caveats

The README does not detail specific limitations or known bugs. The project appears to be relatively new, with initial releases in June 2025.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 9 stars in the last 30 days
