gpt-image2-ppt-skills  by JuneYaooo

AI-powered PPT generation and template cloning

Created 1 month ago
691 stars

Top 48.8% on SourcePulse

Project Summary

This project automates the generation of visually polished presentations. Users, including AI assistants, can create high-definition presentations from natural language prompts or by cloning the style of existing .pptx templates. The main benefit is fast, high-quality PPT creation with flexible styling and two output formats.

How It Works

The system uses OpenAI's gpt-image-2 model to generate presentation visuals and offers two modes. The first applies one of ten curated visual styles (e.g., Spatial Glass, Riso). The second, "template clone" mode, analyzes an input .pptx or image to extract layout, color, and illustration semantics, then reapplies them to new content. Cloning from a .pptx requires rendering the file to images with an external tool such as LibreOffice. Outputs include an interactive HTML viewer and a .pptx file.
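
The rendering step in template clone mode could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's own code: it assumes LibreOffice's soffice binary is on the PATH and uses its headless --convert-to option to turn a .pptx into a PDF, which could then be rasterized page by page. The function names and the "rendered" output directory are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_convert_command(pptx_path: str, out_dir: str, binary: str = "soffice") -> List[str]:
    """Build a headless LibreOffice command that converts a .pptx to PDF."""
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", out_dir,
        pptx_path,
    ]


def render_pptx_to_pdf(pptx_path: str, out_dir: str = "rendered") -> Optional[Path]:
    """Run the conversion if LibreOffice is installed; return the PDF path or None."""
    binary = shutil.which("soffice") or shutil.which("libreoffice")
    if binary is None:
        return None  # LibreOffice not found; a caller might fall back to Docker
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(pptx_path, out_dir, binary), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(pptx_path).stem + ".pdf")
```

Keeping the command construction separate from the subprocess call makes it easy to swap in a Docker-based invocation when no local installation is available.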

Quick Start & Requirements

The project can be installed through an AI assistant (Claude Code, Codex, etc.) by giving it a prompt that references docs/install.md. Alternatively, install manually by cloning the repository and running install_as_skill.sh with a target agent (claude or codex). Requirements are Python 3.8+ and, for template cloning, either a local libreoffice installation or the linuxserver/libreoffice Docker image. Direct API mode needs OpenAI API access (key and base URL); the Codex CLI backend does not.
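
Before installing, the requirements above can be checked with a small preflight script. This is a hedged sketch rather than part of the project: the environment variable names OPENAI_API_KEY and OPENAI_BASE_URL are assumptions (docs/install.md is the authority on the real configuration), and check_requirements is a hypothetical helper.

```python
import os
import shutil
import sys
from typing import Dict, Mapping, Optional, Tuple


def check_requirements(
    env: Optional[Mapping[str, str]] = None,
    python_version: Tuple[int, int] = sys.version_info[:2],
) -> Dict[str, bool]:
    """Report which of the documented requirements appear to be satisfied."""
    env = os.environ if env is None else env
    return {
        "python_3_8_plus": tuple(python_version) >= (3, 8),
        # Template cloning needs a local LibreOffice or Docker.
        "libreoffice_or_docker": any(
            shutil.which(name) for name in ("soffice", "libreoffice", "docker")
        ),
        # Assumed variable names; only relevant for direct API mode.
        "openai_api_key": bool(env.get("OPENAI_API_KEY")),
        "openai_base_url": bool(env.get("OPENAI_BASE_URL")),
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

With the Codex CLI backend, the two OpenAI entries can be ignored.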

Highlighted Details

  • 10 Built-in Styles: Features diverse aesthetics like gradient-glass, dark-aurora, risograph, and y2k-chrome, each with distinct cover, content, and data slide compositions.
  • Template Cloning: Enables precise replication of layout, color schemes, and illustration styles from any provided .pptx or reference image.
  • Dual Output: Generates both a keyboard/touch-navigable HTML viewer and a ready-to-use 16:9 .pptx file.
  • Parallel Generation: Generates about 10 pages in roughly 30 seconds by rendering pages in parallel.
  • Flexible Backend: Supports direct OpenAI API calls or integrates as a skill for agents like Codex, reducing API key management overhead for Codex users.
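
The parallel generation mentioned above can be illustrated with a thread pool. The source does not say how the project parallelizes its requests, so this is only a sketch of the general idea: render_page is a hypothetical stand-in that sleeps instead of calling an image API, and render_deck is an illustrative name.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def render_page(prompt: str) -> str:
    """Hypothetical stand-in for one image-generation request."""
    time.sleep(0.1)  # simulate the latency of a network call
    return f"image for: {prompt}"


def render_deck(prompts: List[str], max_workers: int = 10) -> List[str]:
    """Render all pages concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(render_page, prompts))


if __name__ == "__main__":
    prompts = [f"slide {i}" for i in range(1, 11)]
    start = time.perf_counter()
    pages = render_deck(prompts)
    print(len(pages), "pages in", round(time.perf_counter() - start, 2), "s")
```

Because each request is I/O-bound, threads are enough here: total time approaches the latency of the slowest single page rather than the sum of all pages.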

Maintenance & Community

The project acknowledges inspiration from op7418/NanoBanana-PPT-Skills and lewislulu/html-ppt-skill. Community engagement is fostered through the "LINUX DO" Chinese developer community and a dedicated WeChat group.

Licensing & Compatibility

The project is licensed under the Apache License 2.0, which generally permits commercial use and derivative works, subject to license terms. No specific compatibility restrictions are noted.

Limitations & Caveats

Template cloning requires either a local libreoffice installation or Docker, which adds external dependencies. Direct OpenAI API usage requires managing API keys and incurs usage costs. Integration is geared primarily toward specific AI agent ecosystems such as Claude Code and Codex.

Health Check

  • Last Commit: 5 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 8
  • Star History: 685 stars in the last 30 days
