AgentSkillsCN

baoyu-image-gen

借助 OpenAI 和 Google API 进行 AI 图像生成。支持文本到图像、参考图像、不同宽高比,以及并行生成(推荐同时运行 4 个子代理)。适用场景:当用户要求生成、创作或绘制图像时。

SKILL.md
--- frontmatter
name: baoyu-image-gen
description: AI image generation with OpenAI and Google APIs. Supports text-to-image, reference images, aspect ratios, and parallel generation (recommended 4 concurrent subagents). Use when user asks to generate, create, or draw images.

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI and Google providers.

Script Directory

Agent Execution:

  1. SKILL_DIR = this SKILL.md file's directory
  2. Script path = ${SKILL_DIR}/scripts/main.ts

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):

bash
# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"

# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"

┌──────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├──────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-image-gen/EXTEND.md │ Project directory │ ├──────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md │ User home │ └──────────────────────────────────────────────────┴───────────────────┘

┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ Use defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘

EXTEND.md Supports: Default provider | Default quality | Default aspect ratio

Usage

bash
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google multimodal only)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

Options

OptionDescription
--prompt <text>, -pPrompt text
--promptfiles <files...>Read prompt from files (concatenated)
--image <path>Output image path (required)
--provider google|openaiForce provider (default: google)
--model <id>, -mModel ID
--ar <ratio>Aspect ratio (e.g., 16:9, 1:1, 4:3)
--size <WxH>Size (e.g., 1024x1024)
--quality normal|2kQuality preset (default: 2k)
--imageSize 1K|2K|4KImage size for Google (default: from quality)
--ref <files...>Reference images (Google multimodal only)
--n <count>Number of images
--jsonJSON output

Environment Variables

VariableDescription
OPENAI_API_KEYOpenAI API key
GOOGLE_API_KEYGoogle API key
OPENAI_IMAGE_MODELOpenAI model override
GOOGLE_IMAGE_MODELGoogle model override
OPENAI_BASE_URLCustom OpenAI endpoint
GOOGLE_BASE_URLCustom Google endpoint

Load Priority: CLI args > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Provider Selection

  1. --provider specified → use it
  2. Only one API key available → use that provider
  3. Both available → default to Google

Quality Presets

PresetGoogle imageSizeOpenAI SizeUse Case
normal1K1024pxQuick previews
2k (default)2K2048pxCovers, illustrations, infographics

Google imageSize: Can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

  • Google multimodal: uses imageConfig.aspectRatio
  • Google Imagen: uses aspectRatio parameter
  • OpenAI: maps to closest supported size

Parallel Generation

Supports concurrent image generation via background subagents for batch operations.

SettingValue
Recommended concurrency4 subagents
Max concurrency8 subagents
Use caseBatch generation (slides, comics, infographics)

Agent Implementation:

code
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete

Best Practice: When generating 4+ images, spawn background subagents (recommended 4 concurrent) instead of sequential execution.

Error Handling

  • Missing API key → error with setup instructions
  • Generation failure → auto-retry once
  • Invalid aspect ratio → warning, proceed with default
  • Reference images with non-multimodal model → warning, ignore refs

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.