create_image

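Generates images from a user prompt. Use this when the user wants to generate, create, or edit images with AI. Triggered by requests such as "create an image of ...", "generate a picture", "draw", "make an illustration", "visualize", "edit this image", "modify the photo", or "change the background". Supports text-to-image generation and mask-based image editing. Not for image analysis or description; use it only to create or modify visual content.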

SKILL.md
--- frontmatter
name: create_image
description: |
  Creates an image from a user prompt. Use when the user wants to generate, create, or edit images using AI.
  Triggers: "create an image of", "generate a picture", "draw", "make an illustration", "visualize",
  "edit this image", "modify the photo", "change the background". Supports text-to-image generation and
  image editing with masks. NOT for image analysis or description; only for creating or modifying
  visual content.

Create Image

Overview

This skill generates images from natural language descriptions or edits existing images using AI image generation models. It provides a unified interface supporting multiple vendors:

  • Google Gemini (default) - Fast drafts and high-quality pro output
  • OpenAI - HD quality with optional mask-based editing

Quick Start

bash
# Create output directory first
mkdir -p ./output

# Generate an image (Google, default)
python3 .claude/skills/create_image/image_gen.py "A sunset over mountains" -o ./output/sunset.png

# Generate with OpenAI
python3 .claude/skills/create_image/image_gen.py "A sunset over mountains" --vendor openai -o ./output/sunset.png

# High quality
python3 .claude/skills/create_image/image_gen.py "Detailed portrait" --hq -o ./output/portrait.png

# Edit an existing image
python3 .claude/skills/create_image/image_gen.py "Make the shirt green" --reference ./photo.jpg -o ./output/edited.png

Vendors

| Vendor | Flag | Models | Best For |
| --- | --- | --- | --- |
| Google Gemini | --vendor google (default) | gemini-2.5-flash-image (default), gemini-3-pro-image-preview (--hq) | Fast iterations, aspect ratio control |
| OpenAI | --vendor openai | gpt-image-1 | HD quality, mask-based targeted edits |

API Keys

| Vendor | Environment Variable |
| --- | --- |
| Google | GEMINI_API_KEY |
| OpenAI | OPENAI_API_KEY |

CLI Reference

Basic Usage

bash
python3 .claude/skills/create_image/image_gen.py "<prompt>" [options]

# Prompt from file
python3 .claude/skills/create_image/image_gen.py -p <prompt-file> [options]

Note: Output files should be written to ./output/ (writable workspace directory).

Options

| Flag | Long Form | Default | Description |
| --- | --- | --- | --- |
| (positional) | | None | Inline text prompt |
| -p | --prompt-file | None | Path to file containing prompt (.txt, .md, etc.) |
| -o | --output | generated_image.png | Output file path |
| -r | --aspect-ratio | 1:1 | Aspect ratio (1:1, 16:9, 9:16, etc.) |
| | --vendor | google | Vendor: google or openai |
| | --hq | off | High quality mode |
| -m | --model | (vendor default) | Override model |
| -v | --verbose | off | Verbose output |
| | --api-key | (from env) | API key override |
| | --reference | None | Reference image (enables edit mode) |
| | --mask | None | Mask image (OpenAI only) |

Prompt Sources

You can provide the prompt in two ways:

  1. Inline (positional argument): python3 .claude/skills/create_image/image_gen.py "your prompt here"
  2. From file: python3 .claude/skills/create_image/image_gen.py -p ./prompt.txt

The file option is useful for:

  • Long, detailed prompts
  • Reusable prompt templates
  • Multi-line prompts with formatting
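As a rough illustration of the reusable-template case, the sketch below fills placeholders in a multi-line prompt and saves it to a file that can then be passed with -p. The helper name `write_prompt_file` and the example template are hypothetical and are not part of image_gen.py; only the -p flag comes from this document.

```python
from pathlib import Path
from string import Template


def write_prompt_file(template: str, path: str, **fields: str) -> Path:
    """Fill $placeholders in `template` and save the result for use with -p."""
    # substitute() raises KeyError if a placeholder has no matching field.
    text = Template(template).substitute(fields)
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


# Hypothetical multi-line template; the wording is illustrative only.
PORTRAIT_TEMPLATE = """\
A detailed portrait of $subject.
Style: $style.
Lighting: soft, from the left.
"""
```

Calling `write_prompt_file(PORTRAIT_TEMPLATE, "./output/prompt.txt", subject="an old sailor", style="oil painting")` returns the saved path, which can be handed to the script as `-p ./output/prompt.txt`.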

Supported Aspect Ratios

1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9
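When the target size is known in pixels, the nearest supported ratio can be picked before calling the script. This is a minimal sketch under that assumption: `SUPPORTED_RATIOS` copies the list above, while `check_aspect_ratio` and `closest_ratio` are hypothetical helpers and do not reflect how image_gen.py validates its input.

```python
SUPPORTED_RATIOS = ("1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9", "21:9")


def check_aspect_ratio(ratio: str) -> str:
    """Return the ratio if it is in the supported list, else raise ValueError."""
    normalized = ratio.strip()
    if normalized not in SUPPORTED_RATIOS:
        raise ValueError(
            f"Unsupported aspect ratio {ratio!r}; choose one of {', '.join(SUPPORTED_RATIOS)}"
        )
    return normalized


def ratio_as_float(ratio: str) -> float:
    """Width divided by height for a supported 'W:H' string."""
    width, height = check_aspect_ratio(ratio).split(":")
    return int(width) / int(height)


def closest_ratio(width_px: int, height_px: int) -> str:
    """Pick the supported ratio whose width/height is nearest to the target."""
    target = width_px / height_px
    return min(SUPPORTED_RATIOS, key=lambda r: abs(ratio_as_float(r) - target))
```

For example, a 1920x1080 banner maps to 16:9, which can then be passed as `-r 16:9`.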


Programmatic API

generate_image()

python
from image_gen import generate_image

result = generate_image(
    prompt="A sunset over mountains",
    output_path="./output/sunset.png",  # optional
    aspect_ratio="16:9",                # default: "1:1"
    vendor="google",                    # or "openai"
    high_quality=False,                 # True for pro/HD
    model=None,                         # override default model
    api_key=None,                       # override env variable
)

edit_image()

python
from image_gen import edit_image

result = edit_image(
    prompt="Make the shirt green",
    reference_image="./photo.jpg",
    output_path="./output/edited.png",  # optional
    vendor="google",                    # or "openai"
    high_quality=False,                 # True for pro/HD
    model=None,                         # override default model
    mask_image=None,                    # OpenAI only: path to mask
    api_key=None,                       # override env variable
)

Return Value

Both functions return the same dictionary:

python
{
    "success": bool,           # True if successful
    "image_path": str | None,  # Path where image was saved
    "text": str | None,        # Text response (Google only)
    "error": str | None,       # Error message if failed
    "image": PIL.Image.Image | None,  # PIL Image object
    "model": str,              # Model that was used
    "vendor": str,             # Vendor that was used
}

Instructions for Claude

Step 1: Determine Parameters

From the user's request, extract:

  • Prompt: The image description or edit instruction (required)
    • For long prompts, save to a file and use -p
  • Vendor: User preference, or default to google
  • Mode: Generate (no reference) or Edit (reference provided)
  • Quality: Standard (default) or high (--hq)
  • Aspect ratio: Based on intended use (default: 1:1)
  • Output path: Where to save the image
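The parameters above map onto the CLI flags roughly as follows. This is a sketch, not part of the skill: `build_command` is a hypothetical helper, and the rules that a mask requires a reference image and that -r is only passed when generating are assumptions drawn from the examples in this document.

```python
from typing import List, Optional

SCRIPT = ".claude/skills/create_image/image_gen.py"


def build_command(
    prompt: str,
    output: str,
    vendor: str = "google",
    aspect_ratio: str = "1:1",
    high_quality: bool = False,
    reference: Optional[str] = None,
    mask: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list using only flags listed in the Options table."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if mask and vendor != "openai":
        raise ValueError("--mask is OpenAI only")
    if mask and not reference:
        raise ValueError("--mask needs a --reference image")  # assumption

    cmd = ["python3", SCRIPT, prompt, "--vendor", vendor, "-o", output]
    if reference:
        # A reference image switches the script to edit mode.
        cmd += ["--reference", reference]
        if mask:
            cmd += ["--mask", mask]
    else:
        # The edit examples in this document do not pass -r, so it is
        # only added when generating from text (assumption).
        cmd += ["-r", aspect_ratio]
    if high_quality:
        cmd.append("--hq")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with prompts that contain quotes or spaces.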

Step 2: Choose Vendor

| Scenario | Recommended Vendor |
| --- | --- |
| Default / fast iterations | google |
| User has only an OpenAI key | openai |
| Need mask-based targeted edits | openai |
| User explicitly requests a vendor | As specified |
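One possible reading of the table as code is sketched below. The table does not state an order of precedence, so the order used here (explicit request, then mask requirement, then available keys) is an assumption, and `choose_vendor` is a hypothetical helper rather than something image_gen.py provides.

```python
import os
from typing import Mapping, Optional

VENDORS = ("google", "openai")


def choose_vendor(
    requested: Optional[str] = None,
    needs_mask: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a vendor following the scenario table (precedence is assumed)."""
    env = os.environ if env is None else env
    if requested:
        if requested not in VENDORS:
            raise ValueError(f"Unknown vendor: {requested!r}")
        return requested  # user explicitly requests: as specified
    if needs_mask:
        return "openai"  # mask-based edits are OpenAI only
    has_google = bool(env.get("GEMINI_API_KEY"))
    has_openai = bool(env.get("OPENAI_API_KEY"))
    if has_openai and not has_google:
        return "openai"  # user has only an OpenAI key
    return "google"  # default / fast iterations
```

Passing `env` explicitly keeps the function easy to test; when it is omitted, the real environment is consulted.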

Step 3: Execute

bash
# Create output directory
mkdir -p ./output

# Generate image
python3 .claude/skills/create_image/image_gen.py "<prompt>" --vendor <vendor> -r <aspect-ratio> -o ./output/<filename>.png [--hq]

# Edit image
python3 .claude/skills/create_image/image_gen.py "<prompt>" --reference ./<image> -o ./output/<filename>.png

Programmatic (via inline Python):

bash
python3 << 'EOF'
import sys
sys.path.insert(0, '.claude/skills/create_image')
from image_gen import generate_image, edit_image

result = generate_image(prompt="...", vendor="google", aspect_ratio="16:9", output_path="./output/image.png")
EOF

Step 4: Handle Result

python
if result["success"]:
    print(f"Saved to: {result['image_path']}")
else:
    print(f"Error: {result['error']}")

Examples

Text-to-Image Generation

bash
mkdir -p ./output

# Inline prompt (short)
python3 .claude/skills/create_image/image_gen.py "A cartoon cat wizard" -o ./output/wizard_cat.png

# Widescreen landscape
python3 .claude/skills/create_image/image_gen.py "Mountain panorama at sunset" -r 16:9 -o ./output/mountains.png

# High quality with Google Pro
python3 .claude/skills/create_image/image_gen.py "Detailed portrait" --hq -r 3:4 -o ./output/portrait.png

# Using OpenAI
python3 .claude/skills/create_image/image_gen.py "Steampunk clockwork" --vendor openai -o ./output/steampunk.png

Image Editing

bash
# Edit with inline prompt
python3 .claude/skills/create_image/image_gen.py "Make the background a beach" --reference ./portrait.jpg -o ./output/beach.png

# Edit with Google Pro
python3 .claude/skills/create_image/image_gen.py "Change shirt to green" --reference ./person.jpg --hq -o ./output/green.png

# Targeted edit with OpenAI mask
python3 .claude/skills/create_image/image_gen.py "Replace background with space" --vendor openai --reference ./portrait.png --mask ./bg_mask.png -o ./output/space.png

Mask Image Guidelines (OpenAI only)

For targeted image editing with OpenAI:

  • Use PNG format
  • Same dimensions as reference image
  • Transparent areas: Where the model may change pixels
  • Opaque areas: Where the original must be preserved
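The first two guidelines, plus the need for an alpha channel, can be checked before sending a mask. The sketch below reads only the PNG header with the standard library. It assumes the reference image is also a PNG (as in the masked example in this document), and the names `png_header` and `check_mask` are hypothetical.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color types that carry an alpha channel: 4 = grayscale+alpha, 6 = RGBA.
# Indexed PNGs that get transparency from a tRNS chunk are not detected here.
ALPHA_COLOR_TYPES = {4, 6}


def png_header(data: bytes) -> Tuple[int, int, int]:
    """Return (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: width, height (big-endian uint32),
    # then bit depth and color type (one byte each).
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def check_mask(mask: bytes, reference: bytes) -> None:
    """Raise ValueError if the mask breaks one of the guidelines above."""
    mask_w, mask_h, mask_type = png_header(mask)
    ref_w, ref_h, _ = png_header(reference)
    if (mask_w, mask_h) != (ref_w, ref_h):
        raise ValueError(f"mask is {mask_w}x{mask_h}, reference is {ref_w}x{ref_h}")
    if mask_type not in ALPHA_COLOR_TYPES:
        raise ValueError("mask has no alpha channel, so it has no transparent areas")
```

A typical call would be `check_mask(Path("./bg_mask.png").read_bytes(), Path("./portrait.png").read_bytes())`. The check does not confirm that any pixel is actually transparent, only that the file format can express transparency.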

Error Handling

| Error Type | Cause |
| --- | --- |
| ValueError | Invalid aspect ratio or vendor, missing API key, or empty prompt |
| FileNotFoundError | Prompt file, reference image, or mask image doesn't exist |
| result["error"] | API/network errors (check result["success"]) |
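Read this way, input problems surface as raised exceptions while API and network failures come back inside the result dictionary, so a caller has two paths to handle. The wrapper below is a minimal sketch that folds both into the dictionary shape; `run_safely` is a hypothetical helper, and it takes the function to call as an argument so it makes no assumption about how image_gen is imported.

```python
from typing import Any, Callable, Dict


def run_safely(func: Callable[..., Dict[str, Any]], **kwargs: Any) -> Dict[str, Any]:
    """Call generate_image/edit_image and turn raised errors into a result dict."""
    try:
        return func(**kwargs)
    except (ValueError, FileNotFoundError) as exc:
        # Mirror the documented keys so callers only need to check "success".
        return {
            "success": False,
            "image_path": None,
            "text": None,
            "error": f"{type(exc).__name__}: {exc}",
            "image": None,
            "model": kwargs.get("model") or "",
            "vendor": kwargs.get("vendor", "google"),
        }
```

With this in place, `result = run_safely(generate_image, prompt="...", output_path="./output/image.png")` can be followed by a single `if result["success"]:` check.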