smart-illustrator by axtonliu

AI for contextual article illustrations and cover generation

Created 5 months ago

518 stars

Top 59.8% on SourcePulse

Project Summary

smart-illustrator automates the time-consuming work of creating contextual illustrations for articles and other content. It targets newsletter writers, YouTube creators, and technical bloggers, using AI to generate relevant visuals in minutes, including YouTube thumbnails designed for a high click-through rate (CTR). The stated aim is better content engagement and visual consistency.

How It Works

The core of the project is a "Tri-Engine System" that analyzes the content and selects one of three engines: Gemini for creative visuals, Excalidraw for hand-drawn diagrams, or Mermaid for structured diagrams. "Smart Position Detection" identifies suitable placement points within an article, and a "Cover Learning System" applies YouTube thumbnail best practices. Together, these cover illustration needs ranging from conceptual art to technical diagrams, with an emphasis on contextual relevance and efficiency.
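The engine selection described above can be pictured with a minimal sketch. It is not the project's implementation: the keyword lists, the scoring rule, and the names Engine and choose_engine are all hypothetical, chosen only to show how a content-type signal might map to one of the three engines.

```python
from enum import Enum


class Engine(Enum):
    GEMINI = "gemini"          # creative visuals
    EXCALIDRAW = "excalidraw"  # hand-drawn diagrams
    MERMAID = "mermaid"        # structured diagrams


# Hypothetical cue words; the project's real content analysis is not documented here.
STRUCTURED_CUES = ("step", "pipeline", "workflow", "sequence", "architecture")
SKETCH_CUES = ("compare", "versus", "concept", "relationship", "overview")


def choose_engine(passage: str) -> Engine:
    """Pick an engine by counting cue words in the passage (illustrative heuristic)."""
    text = passage.lower()
    structured = sum(text.count(cue) for cue in STRUCTURED_CUES)
    sketch = sum(text.count(cue) for cue in SKETCH_CUES)
    if structured == 0 and sketch == 0:
        # No diagram-like cues at all: fall back to a creative visual.
        return Engine.GEMINI
    # Ties go to the structured engine.
    return Engine.MERMAID if structured >= sketch else Engine.EXCALIDRAW
```

A real system would more plausibly ask a language model to classify the passage; the keyword count only stands in for that decision.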

Quick Start & Requirements

To install, clone the repository into the Claude Code Skills directory: git clone https://github.com/axtonliu/smart-illustrator.git ~/.claude/skills/smart-illustrator. Prerequisites are the Claude Code CLI, the Bun runtime, and the Mermaid CLI (npm install -g @mermaid-js/mermaid-cli). Excalidraw export needs additional dependencies: run npm install && npx playwright install firefox in the scripts directory. A Gemini API key is optional and is needed only for creative visuals. A full walkthrough is available at https://youtu.be/TbyJ3imLuXQ.
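Before installing, it can help to confirm that the prerequisites are on the PATH. The sketch below is one way to do that with the Python standard library. The executable names are assumptions: claude for the Claude Code CLI, bun for the Bun runtime, and mmdc for the binary installed by @mermaid-js/mermaid-cli. The helper missing_tools is hypothetical and not part of the project.

```python
import shutil
from typing import Callable, Optional

# Label -> executable name. The executable names are assumptions (see lead-in).
REQUIRED_TOOLS = {
    "Claude Code CLI": "claude",
    "Bun runtime": "bun",
    "Mermaid CLI": "mmdc",
}


def missing_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the labels of required tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    return [label for label, exe in REQUIRED_TOOLS.items() if which(exe) is None]
```

Calling missing_tools() with no arguments checks the current environment; an empty list means all three tools were found.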

Highlighted Details

Tri-Engine System: Auto-selects Gemini, Excalidraw, or Mermaid based on content type.
Smart Position Detection: Analyzes article structure for optimal illustration points.
Diverse Illustration Types: Supports over 10 types including flowcharts, mindmaps, scenes, and metaphors.
Cover Mode: Generates high-CTR YouTube thumbnails and social media covers with built-in best practices.
Extensible Styles: Offers Light, Dark, Minimal, and custom styles, with brand customization options.
Multi-Platform Output: Presets for YouTube, WeChat, Twitter, and Xiaohongshu.
Resume Generation: Allows interrupted generation processes to be resumed.
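
As a rough illustration of what position detection over article structure could look like, the sketch below assumes the article is Markdown with ATX headings (lines starting with #). It proposes one insertion point per sufficiently long section, at the end of that section's first paragraph. The function name candidate_positions, the min_words threshold, and the placement rule are all hypothetical; the project's actual analysis may work quite differently.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def candidate_positions(markdown: str, min_words: int = 40) -> list[tuple[int, str]]:
    """Return (line_index, heading) pairs marking where an illustration could go.

    line_index is the 0-based index of the last line of the first paragraph in
    each section whose body has at least min_words words.
    """
    sections = []  # one dict per heading: heading text, word count, first-paragraph end
    current = None
    in_first_para = False
    for i, line in enumerate(markdown.splitlines()):
        match = HEADING.match(line)
        if match:
            current = {"heading": match.group(1).strip(), "words": 0, "para_end": None}
            sections.append(current)
            in_first_para = False
            continue
        if current is None:
            continue  # text before the first heading is ignored
        if line.strip():
            current["words"] += len(line.split())
            if current["para_end"] is None or in_first_para:
                current["para_end"] = i
                in_first_para = True
        else:
            in_first_para = False  # a blank line ends the paragraph
    return [
        (s["para_end"], s["heading"])
        for s in sections
        if s["para_end"] is not None and s["words"] >= min_words
    ]
```

The sketch does not skip fenced code blocks, so a comment line starting with # inside a fence would be misread as a heading.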

Maintenance & Community

The project is explicitly marked as "Experimental" and a "public prototype." The primary focus is on demonstrating system integration rather than active codebase maintenance. Contributions are welcomed for reproducible bug reports, documentation, and small PRs, but feature requests may not be addressed due to limited maintenance capacity. No specific community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

The project is released under the permissive MIT License, which allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

As an experimental prototype, the system does not yet cover all input scales or edge cases, and output quality can vary with model version and input structure. Because the developer's primary goal is to showcase the system's mechanics rather than to maintain the codebase, users who run into issues should file detailed, reproducible bug reports.

Health Check

Last Commit: 2 weeks ago

Responsiveness: Inactive

Star History

35 stars in the last 30 days