GPT-Image2-Skill by wuyoscar

AI image generation and editing toolkit for agents and developers

Created 3 months ago

3,877 stars

Top 12.2% on SourcePulse

Project Summary

This repository provides a prompt gallery and a CLI tool for OpenAI's gpt-image-2 model, covering both image generation and image editing. It targets AI engineers, researchers, and power users with curated, copy-paste prompts and runnable examples for applications ranging from research figures to artistic creations, and it can be installed as a skill in agentic runtimes.

How It Works

The project uses OpenAI's gpt-image-2 API for both text-to-image generation and image editing. It offers a command-line interface (gpt-image) for direct interaction and also functions as a "skill" for agent frameworks such as Codex, Claude Code, and Hermes Agent. Its core value lies in a gallery of 162 curated prompts, categorized by use case, and in its guidance on prompt engineering, which helps users reach specific artistic styles, technical illustrations, or photorealistic outputs.
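As a rough illustration of the text-to-image path described above, the sketch below calls an OpenAI-style client and writes the returned image to disk. It is not the project's own code. The helper name generate_image is hypothetical, the default size is an arbitrary choice, and the assumption that gpt-image-2 returns base64 data in b64_json (as OpenAI's earlier gpt-image models do) is unverified here.

```python
import base64
from pathlib import Path


def generate_image(client, prompt, out_path, model="gpt-image-2", size="1024x1024"):
    """Request one image and write the decoded bytes to out_path."""
    # `client` is expected to behave like openai.OpenAI(): it must expose
    # client.images.generate(...). The model name comes from the summary above.
    result = client.images.generate(model=model, prompt=prompt, size=size)
    # Assumption: the image comes back base64-encoded in data[0].b64_json,
    # as it does for OpenAI's earlier gpt-image models.
    image_bytes = base64.b64decode(result.data[0].b64_json)
    out = Path(out_path)
    out.write_bytes(image_bytes)
    return out


# Usage with the real SDK (needs `pip install openai` and OPENAI_API_KEY):
#   from openai import OpenAI
#   generate_image(OpenAI(), "a watercolor fox", "fox.png")
```

Passing the client in as an argument keeps the helper free of network setup, so it can be exercised with a stub before any API credit is spent.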

Quick Start & Requirements

Installation varies by runtime:

Claude Code: /plugin marketplace add wuyoscar/gpt_image_2_skill followed by /plugin install gpt-image@wuyoscar-skills.
Codex: Run $skill-installer install https://github.com/wuyoscar/gpt_image_2_skill/tree/main/skills/gpt-image, or git clone the repository and copy the skill manually.
Manual Agent-Skill: git clone the repository and symlink the skills/gpt-image folder into your agent's skills directory.
CLI: Run without a permanent install via uvx --from git+https://github.com/wuyoscar/gpt_image_2_skill gpt-image, or use uv tool install ... for a PATH-accessible command.

The one prerequisite is an OPENAI_API_KEY, supplied either as an environment variable or in ~/.env.
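The key lookup described above could look roughly like the following sketch. It is an illustration only, not the CLI's actual loader: the function name load_api_key is hypothetical, and the precedence (environment variable first, then ~/.env) and the simple KEY=VALUE parsing are assumptions.

```python
import os
from pathlib import Path


def load_api_key(env_file=None, environ=None):
    """Return OPENAI_API_KEY from the environment, else from a dotenv-style file."""
    environ = os.environ if environ is None else environ
    key = environ.get("OPENAI_API_KEY")
    if key:
        # Assumed precedence: a value already in the environment wins.
        return key
    path = Path(env_file) if env_file is not None else Path.home() / ".env"
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name == "OPENAI_API_KEY":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"") or None
    return None
```

Calling load_api_key() with no arguments checks os.environ and then ~/.env. It returns None when neither source has a key, which lets the caller fail early with a clear message.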

Highlighted Details

Features a gallery of 162 prompts and corresponding image assets.
Supports diverse applications: research figures, UI mockups, anime, photography, typography, tattoo design, and image editing (inpainting, restyling).
Includes a detailed "Prompting Fundamentals" guide based on OpenAI's cookbook, covering prompt structure, specificity, and quality cues.
The CLI offers extensive parameters for controlling image size, quality, format, and editing inputs.

Maintenance & Community

The repository was last updated on April 25, 2026. Contributions are welcomed, with guidelines provided in CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md. No specific community channels (like Discord or Slack) are listed.

Licensing & Compatibility

The project is released under the CC BY 4.0 license, and prompts drawn from outside sources require attribution. It is designed to work with agent runtimes that support skill installation.

Limitations & Caveats

The project requires access to the OpenAI API and a valid API key. The input-fidelity parameter is not supported by gpt-image-2, so the CLI drops it. The prompt gallery is extensive but curated, so it may not cover every use case.
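One way a wrapper might drop a parameter the model does not accept is sketched below. This is a hypothetical illustration, not the CLI's implementation: the names UNSUPPORTED_PARAMS and drop_unsupported are invented for the example, and the underscore spelling input_fidelity is an assumption about how the parameter is named at the API level.

```python
# Hypothetical table of request parameters to strip, keyed by model name.
UNSUPPORTED_PARAMS = {
    "gpt-image-2": {"input_fidelity"},  # per the caveat above
}


def drop_unsupported(model, params):
    """Return (kept, dropped): params safe to send, and the names removed."""
    blocked = UNSUPPORTED_PARAMS.get(model, set())
    kept = {name: value for name, value in params.items() if name not in blocked}
    dropped = sorted(name for name in params if name in blocked)
    return kept, dropped


kept, dropped = drop_unsupported(
    "gpt-image-2",
    {"size": "1024x1024", "input_fidelity": "high"},
)
print(kept)     # {'size': '1024x1024'}
print(dropped)  # ['input_fidelity']
```

Returning the dropped names alongside the filtered parameters lets a caller warn the user instead of discarding the option silently.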

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive
Star History: 595 stars in the last 30 days