Gemini Image Generation
Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.
When to Use This Skill
Use this skill when you need to:
- •Create visual content from text descriptions
- •Generate multiple image variations
- •Create images at specific resolutions (1K, 2K, 4K)
- •Produce images for different aspect ratios (social media, banners, etc.)
- •Generate photorealistic images or artistic visuals
- •Create images with person generation controls
- •Batch generate multiple images at once
- •Combine with text generation for complete content creation
Available Scripts
scripts/generate_image.py
Purpose: Generate images using Gemini 3 Pro Image or Imagen 4 models
When to use:
- •Any image generation task
- •Multiple image generation (1-4 per request)
- •Custom resolution and aspect ratio needs
- •Professional asset creation
- •Photorealistic or artistic image generation
Key parameters:
| Parameter | Description | Example |
|---|---|---|
prompt | Text description (required) | "A futuristic city at sunset" |
--model, -m | Model to use | gemini-3-pro-image-preview |
--output-dir, -o | Output directory for images | images/ |
--name, -n | Base name for output files | artwork |
--no-timestamp | Disable auto timestamp | Flag |
--aspect, -a | Aspect ratio | 16:9 |
--size, -s | Resolution | 2K or 4K |
--num | Number of images (1-4) | 4 |
--person | Person generation policy | allow_adult |
Output: List of saved PNG file paths
Workflows
Workflow 1: Basic Image Generation
bash
python scripts/generate_image.py "A futuristic city at sunset with flying cars"
- •Best for: Quick image generation, prototypes
- •Model:
gemini-3-pro-image-preview(default, highest quality) - •Output:
images/generated_image_YYYYMMDD_HHMMSS.png
Workflow 2: Social Media (Instagram, Facebook)
bash
python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
- •Best for: Instagram posts, profile pictures
- •Aspect: 1:1 (square format)
- •Resolution: 2K (2048x2048)
- •Output:
images/coffee-shop_YYYYMMDD_HHMMSS.png
Workflow 3: YouTube Thumbnails (16:9)
bash
python scripts/generate_image.py "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
- •Best for: YouTube, video thumbnails
- •Aspect: 16:9 (widescreen)
- •Resolution: 2K (2752x1536)
- •Output:
images/thumbnail_YYYYMMDD_HHMMSS.png
Workflow 4: Multiple Variations
bash
python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract
- •Best for: A/B testing, design options
- •Generates: 4 distinct variations
- •Output:
images/abstract_YYYYMMDD_HHMMSS_0.png,images/abstract_YYYYMMDD_HHMMSS_1.png, etc.
Workflow 5: Custom Output Directory
bash
python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum
- •Best for: Print materials, high-end assets, organized projects
- •Model:
gemini-3-pro-image-previewonly (for 4K) - •Resolution: 4K (5504x3072 for 16:9)
- •Directory created automatically if it doesn't exist
Workflow 6: Photorealistic Images (Imagen 4)
bash
python scripts/generate_image.py "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate
- •Best for: Realistic photos, product shots
- •Model:
imagen-4.0-generate-001(photorealistic) - •Notes: English prompts only
- •Max 4 images per request
Workflow 7: Blog Post Featured Image
bash
python scripts/generate_image.py "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image
- •Best for: Blog headers, article images
- •Combines well with: gemini-text for blog content generation
Workflow 8: Content Creation Pipeline (Text + Image)
bash
# 1. Generate content (gemini-text skill) python skills/gemini-text/scripts/generate.py "Write a product description for smart home device" # 2. Generate product image (this skill) python scripts/generate_image.py "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product # 3. Create social media post
- •Best for: E-commerce, marketing campaigns
- •Combines with: gemini-text, gemini-batch for batch production
Workflow 9: Disable Timestamp
bash
python scripts/generate_image.py "Fixed filename image" --name my-image --no-timestamp
- •Best for: When you want complete control over filename
- •Output:
images/my-image.png(no timestamp) - •Use when: Generating files for specific naming schemes or automated pipelines
Parameters Reference
Model Selection
| Model | Nickname | Quality | Max Size | Best For |
|---|---|---|---|---|
gemini-3-pro-image-preview | Nano Banana Pro | Highest | 4K | Professional assets, advanced text rendering |
gemini-2.5-flash-image | Nano Banana | Good | 2K | High-volume, low-latency |
imagen-4.0-generate-001 | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |
Aspect Ratios
| Ratio | Use Case | 1K Size | 2K Size |
|---|---|---|---|
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |
Note: 4K resolution only available with gemini-3-pro-image-preview
Resolution Guide
| Size | Use Case | Best Model |
|---|---|---|
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | gemini-3-pro only |
Person Generation Policy
| Policy | Description | Restrictions |
|---|---|---|
dont_allow | No people in images | None |
allow_adult | Adults only | Recommended default |
allow_all | All ages | Restricted in EU, UK, CH, MENA |
Output Interpretation
File Naming
- •Default format:
{name}_YYYYMMDD_HHMMSS.png(auto timestamp) - •Single image example:
artwork_20260130_031643.png - •Multiple images:
{name}_YYYYMMDD_HHMMSS_0.png,{name}_YYYYMMDD_HHMMSS_1.png, etc. - •Without timestamp (
--no-timestamp):{name}.png - •Script prints: "Saved: /path/to/file.png"
Image Quality
- •All images include SynthID watermark for authenticity
- •PNG format for lossless quality
- •Can be converted to JPEG/WEBP if needed
- •4K images are significantly larger file sizes
Error Messages
- •"Model not available": Check model name spelling
- •"Unsupported size": Verify size/model combination
- •"Aspect ratio error": Use supported ratios for selected model
Common Issues
"google-genai or pillow not installed"
bash
pip install google-genai pillow
"Image generation failed"
- •Check prompt length (too verbose can fail)
- •Try simpler, more focused prompts
- •Verify model availability in your region
- •Check API quota limits
"Unsupported aspect ratio"
- •Check if ratio is supported by selected model
- •Imagen 4 has fewer ratio options than Gemini
- •Use 16:9 or 1:1 for best compatibility
"4K not supported"
- •4K only works with
gemini-3-pro-image-preview - •Use
--size 2Kfor other models - •Try
--model gemini-3-pro-image-preview --size 4K
"Imagen prompt language error"
- •Imagen models support English prompts only
- •Use
gemini-3-pro-image-previewfor other languages - •Translate prompt to English for Imagen
File too large for storage
- •Use
--size 1Kfor smaller files - •Compress images after generation
- •Convert PNG to JPEG for web use
Best Practices
Prompt Engineering
- •Be specific and descriptive
- •Include style descriptors (e.g., "photorealistic", "digital art")
- •Mention lighting, mood, and composition
- •Use analogies for complex concepts
- •Avoid negative prompts (describe what you want, not what to avoid)
Model Selection
- •Use
gemini-3-pro-image-previewfor: High quality, text rendering, 4K - •Use
gemini-2.5-flash-imagefor: Speed, high volume - •Use
imagen-4.0-generate-001for: Photorealism, product shots
Performance Optimization
- •Generate multiple images at once with
--num - •Use lower resolution for previews
- •Batch requests for high-volume needs (gemini-batch skill)
- •Cache results for repeated requests
Quality Tips
- •Use 2K resolution for most web uses
- •4K only when maximum detail is needed
- •Combine specific prompts with style guidance
- •Test prompts with
--num 1before generating batches
Cost Management
- •Use flash models for cost efficiency
- •4K generation costs significantly more
- •Batch multiple requests when possible
- •Generate at 1K for testing, 2K/4K for final
Related Skills
- •gemini-text: Generate text content alongside images
- •gemini-tts: Create audio for image-based content
- •gemini-batch: Process multiple image requests efficiently
- •gemini-embeddings: Generate image embeddings for similarity search
Quick Reference
bash
# Basic python scripts/generate_image.py "Your prompt" # Social media (1:1) python scripts/generate_image.py "Prompt" --aspect 1:1 --size 2K --name social-post # YouTube thumbnail (16:9) python scripts/generate_image.py "Prompt" --aspect 16:9 --size 2K --name thumbnail # 4K high quality python scripts/generate_image.py "Prompt" --aspect 16:9 --size 4K --name high-res # Multiple variations python scripts/generate_image.py "Prompt" --num 4 --name variations # Custom directory python scripts/generate_image.py "Prompt" --output-dir ./my-images/ --name custom # Photorealistic python scripts/generate_image.py "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo # No timestamp python scripts/generate_image.py "Prompt" --name fixed-name --no-timestamp
Reference
- •See
references/for model documentation (if available) - •Get API key: https://aistudio.google.com/apikey
- •Documentation: https://ai.google.dev/gemini-api/docs/image-generation
- •SynthID: https://deepmind.google/technologies/synthid/