ppt-image-first by NyxTides

AI PPT generation transforming ideas into visual presentations

Created 2 months ago

1,135 stars

Top 33.3% on SourcePulse

Project Summary

ppt-image-first is a conversation-first, image-first PPT workflow skill for CLI environments such as Codex, Claude Code, and Opencode. It turns a vague presentation request into a structured, iterative process that moves from requirements intake through content generation and style preview to final output. Compared with template-based or form-heavy PPT tools, it is more guided and more visually driven, and it suits users who want a collaborative design partner for producing high-fidelity presentation drafts.

How It Works

The core method is an "image-first" strategy: GPT Image 2 generates full-page visuals, which are then packaged in a PPTX container. Traditional generators, by contrast, build editable native PowerPoint objects. The workflow runs in distinct stages: initial intake; content base generation (which outputs content_report.md); iterative style proposal and preview using real image mockups; refinement; detailed planning (design_spec.md, slide_blueprint.md, spec_lock.md); and final generation. This staged, conversational approach collects user feedback on direction and previews before final designs are locked in.
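The staged flow described above can be pictured as a small state machine. The following is a minimal sketch, not the project's code: the `Stage` names, the `next_stage` and `artifacts_so_far` helpers, and the rule that every stage advances only on explicit user approval are all assumptions made for illustration. Only the artifact file names come from the summary.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names mirroring the order given in the summary.
    INTAKE = "intake"
    CONTENT_BASE = "content_base"
    STYLE_PREVIEW = "style_preview"
    REFINEMENT = "refinement"
    PLANNING = "planning"
    FINAL_GENERATION = "final_generation"


# Artifacts named in the summary, keyed by the stage that produces them.
ARTIFACTS = {
    Stage.CONTENT_BASE: ["content_report.md"],
    Stage.PLANNING: ["design_spec.md", "slide_blueprint.md", "spec_lock.md"],
}

# Enum iteration preserves definition order, which is the workflow order.
ORDER = list(Stage)


def next_stage(current: Stage, approved: bool) -> Stage:
    """Advance only on approval (an assumption); otherwise stay and iterate."""
    if not approved:
        return current
    idx = ORDER.index(current)
    # The last stage has no successor, so it maps to itself.
    return ORDER[min(idx + 1, len(ORDER) - 1)]


def artifacts_so_far(current: Stage) -> list[str]:
    """Files expected to exist once `current` has completed."""
    done = ORDER[: ORDER.index(current) + 1]
    return [name for stage in done for name in ARTIFACTS.get(stage, [])]
```

In this sketch, rejecting a style preview keeps the session at `Stage.STYLE_PREVIEW`, which models the iterate-until-approved loop, and `artifacts_so_far(Stage.PLANNING)` lists all four documents named in the summary.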

Quick Start & Requirements

A demo presentation (ppt-image-first-demo-deck.pptx) is included in the project so the skill's output can be evaluated directly. The provided documentation does not give installation commands, non-default prerequisites (e.g., GPU, CUDA versions), or setup instructions.

Highlighted Details

Conversation-First Interaction: Interaction is a dialogue in which the agent acts as a design partner, asking for feedback on judgments, direction, and previews instead of requiring long forms.
Image-First Output: Final deliverables and intermediate previews are generated as full-page images, prioritizing visual fidelity and consistency over individual editable elements.
Content Augmentation: If user-provided material is insufficient, the skill generates a content_report.md to establish a content foundation before visual design commences.
Iterative Style Refinement: Users provide feedback on real image previews, allowing for iterative adjustments to style and direction before finalization.
Structured Planning Outputs: Generates key planning documents including design_spec.md, slide_blueprint.md, and spec_lock.md to define global direction, page-level strategy, and execution constraints.
Integrated Workflow Shells: Uses built-in HTML shells (preview_shell, candidate_picker_shell, review_shell) to manage style previews, candidate selection, and review.

Maintenance & Community

The project acknowledges the "Linux.do community" for its role in promoting open sharing. No specific details regarding core maintainers, active contributors, sponsorships, or dedicated community channels (e.g., Discord, Slack) are provided.

Licensing & Compatibility

The provided documentation does not specify the project's license type or any associated compatibility notes for commercial use or integration with closed-source systems.

Limitations & Caveats

The output consists of high-completion visual drafts, closer to advanced mockups than to fully editable native PowerPoint elements, so individual text boxes or shapes may not be modifiable after generation. The default aspect ratio is 16:9. The review and retouch stage is an integral part of the core workflow, not an optional add-on.
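Because each page is a single image at a 16:9 ratio, the image height follows directly from the chosen width. The helper below is a small illustrative sketch: the `page_size` function and the example width are hypothetical, since the summary states only the aspect ratio and not a resolution.

```python
from fractions import Fraction

# Default aspect ratio stated in the summary.
ASPECT = Fraction(16, 9)


def page_size(width_px: int) -> tuple[int, int]:
    """Return (width, height) in pixels for an exact 16:9 page image."""
    height = Fraction(width_px) / ASPECT
    # Reject widths that would need a fractional pixel height.
    if height.denominator != 1:
        raise ValueError(f"width {width_px} has no whole-pixel 16:9 height")
    return width_px, int(height)
```

For example, `page_size(1920)` returns `(1920, 1080)`, while a width that is not a multiple of 16 raises `ValueError` rather than silently rounding.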

Health Check

Last Commit: 2 months ago

Responsiveness: Inactive

Star History

120 stars in the last 30 days