Gemini Image Generation (Nano Banana Pro)
Generate and edit images using Google's Gemini API. Use skillx imagegen ... for all CLI operations.
Default Model
| Model | Resolution | Best For |
|---|---|---|
gemini-3-pro-image-preview | 1K-4K | All image generation (default) |
Note: Always use this Pro model. Only use a different model if explicitly requested.
Quick Reference
Default Settings
- •Model:
gemini-3-pro-image-preview - •Resolution: 1K (default, options: 1K, 2K, 4K)
- •Aspect Ratio: 1:1 (default)
Available Aspect Ratios
1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Available Resolutions
1K (default), 2K, 4K
CLI Commands
Ready-to-use commands for common tasks.
Generate Image from Text
skillx imagegen generate "A sunset over mountains" -o sunset.jpg skillx imagegen generate "A minimalist logo" --aspect 1:1 --size 2K -o logo.jpg
Options:
- •
-o, --output: Output file path (default: output.jpg) - •
--aspect: Aspect ratio (1:1, 16:9, 9:16, etc.) - •
--size: Resolution (1K, 2K, 4K)
Edit an Existing Image
skillx imagegen edit input.jpg "Add a sunset to the sky" -o edited.jpg skillx imagegen edit photo.png "Make it look like a watercolor painting" --size 2K
Options:
- •
-o, --output: Output file path (default: edited.jpg) - •
--size: Resolution (1K, 2K, 4K)
Compose Multiple Images
skillx imagegen compose "Create a group photo in an office" person1.jpg person2.jpg -o group.jpg skillx imagegen compose "Combine these into a collage" img1.png img2.png img3.png --aspect 16:9
Options:
- •
-o, --output: Output file path (default: composed.jpg) - •
--aspect: Aspect ratio for output - •
--size: Resolution (1K, 2K, 4K) - •Supports up to 14 input images
Core API Pattern
import os
from google import genai
from google.genai import types
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
# Basic generation (1K, 1:1 - defaults)
response = client.models.generate_content(
model="gemini-3-pro-image-preview",
contents=["Your prompt here"],
config=types.GenerateContentConfig(
response_modalities=['TEXT', 'IMAGE'],
),
)
for part in response.parts:
if part.text:
print(part.text)
elif part.inline_data:
image = part.as_image()
image.save("output.png")
Custom Resolution & Aspect Ratio
from google.genai import types
response = client.models.generate_content(
model="gemini-3-pro-image-preview",
contents=[prompt],
config=types.GenerateContentConfig(
response_modalities=['TEXT', 'IMAGE'],
image_config=types.ImageConfig(
aspect_ratio="16:9", # Wide format
image_size="2K" # Higher resolution
),
)
)
Resolution Examples
# 1K (default) - Fast, good for previews image_config=types.ImageConfig(image_size="1K") # 2K - Balanced quality/speed image_config=types.ImageConfig(image_size="2K") # 4K - Maximum quality, slower image_config=types.ImageConfig(image_size="4K")
Aspect Ratio Examples
# Square (default) image_config=types.ImageConfig(aspect_ratio="1:1") # Landscape wide image_config=types.ImageConfig(aspect_ratio="16:9") # Ultra-wide panoramic image_config=types.ImageConfig(aspect_ratio="21:9") # Portrait image_config=types.ImageConfig(aspect_ratio="9:16") # Photo standard image_config=types.ImageConfig(aspect_ratio="4:3")
Editing Images
Pass existing images with text prompts:
from PIL import Image
img = Image.open("input.png")
response = client.models.generate_content(
model="gemini-3-pro-image-preview",
contents=["Add a sunset to this scene", img],
config=types.GenerateContentConfig(
response_modalities=['TEXT', 'IMAGE'],
),
)
Multi-Turn Refinement
Use chat for iterative editing:
from google.genai import types
chat = client.chats.create(
model="gemini-3-pro-image-preview",
config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE'])
)
response = chat.send_message("Create a logo for 'Acme Corp'")
# Save first image...
response = chat.send_message("Make the text bolder and add a blue gradient")
# Save refined image...
Prompting Best Practices
Photorealistic Scenes
Include camera details: lens type, lighting, angle, mood.
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"
Stylized Art
Specify style explicitly:
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"
Text in Images
Be explicit about font style and placement:
"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"
Product Mockups
Describe lighting setup and surface:
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
Advanced Features
Google Search Grounding
Generate images based on real-time data:
response = client.models.generate_content(
model="gemini-3-pro-image-preview",
contents=["Visualize today's weather in Tokyo as an infographic"],
config=types.GenerateContentConfig(
response_modalities=['TEXT', 'IMAGE'],
tools=[{"google_search": {}}]
)
)
Multiple Reference Images (Up to 14)
Combine elements from multiple sources:
response = client.models.generate_content(
model="gemini-3-pro-image-preview",
contents=[
"Create a group photo of these people in an office",
Image.open("person1.png"),
Image.open("person2.png"),
Image.open("person3.png"),
],
config=types.GenerateContentConfig(
response_modalities=['TEXT', 'IMAGE'],
),
)
Important: File Format & Media Type
CRITICAL: The Gemini API returns images in JPEG format by default. When saving, always use .jpg extension to avoid media type mismatches.
# CORRECT - Use .jpg extension (Gemini returns JPEG)
image.save("output.jpg")
# WRONG - Will cause "Image does not match media type" errors
image.save("output.png") # Creates JPEG with PNG extension!
Converting to PNG (if needed)
If you specifically need PNG format:
from PIL import Image
# Generate with Gemini
for part in response.parts:
if part.inline_data:
img = part.as_image()
# Convert to PNG by saving with explicit format
img.save("output.png", format="PNG")
Verifying Image Format
Check actual format vs extension with the file command:
file image.png # If output shows "JPEG image data" - rename to .jpg!
Notes
- •All generated images include SynthID watermarks
- •Gemini returns JPEG format by default - always use
.jpgextension - •Image-only mode (
responseModalities: ["IMAGE"]) won't work with Google Search grounding - •For editing, describe changes conversationally—the model understands semantic masking
- •Default to 1K resolution for speed; use 2K/4K when quality is critical