Gemini API Skill

Build AI applications with Google's Gemini models and tools.

Quick Start

Installation

bash

# Python
pip install google-genai

# JavaScript/Node.js
npm install @google/genai

# Go
go get google.golang.org/genai

Environment Setup

bash

export GEMINI_API_KEY="your-api-key"

Basic Usage

Python:

python

from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Your prompt here"
)
print(response.text)

JavaScript:

javascript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
    model: "gemini-2.5-flash",
    contents: "Your prompt here"
});
console.log(response.text);

REST:

bash

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"contents": [{"parts": [{"text": "Your prompt here"}]}]}'

Model Selection

Model	Best For	Context Window
Gemini 3 Pro	Most intelligent tasks, multimodal reasoning, agentic	See models-overview
Gemini 2.5 Pro	Complex reasoning, coding, extended thinking	1M tokens
Gemini 2.5 Flash	Balanced performance, general tasks	1M tokens
Gemini 2.5 Flash-Lite	High-volume, cost-sensitive, fastest	See models-overview
Imagen	High-fidelity image generation	N/A
Veo 3.1	Video generation (8s, 720p/1080p with audio)	N/A
Nano Banana	Native image gen with Gemini 2.5 Flash	N/A
Nano Banana Pro	Native image gen with Gemini 3 Pro	N/A

Reference Documentation Index

Getting Started

Topic	File	Description
Setup & Libraries	getting-started.md	API keys, SDK installation, OpenAI compatibility

Models & Pricing

Topic	File	Description
Model Overview	models-overview.md	All models, capabilities, context windows
Pricing	api-pricing.md	Token costs, tool pricing
Rate Limits	rate-limits.md	RPM/TPM limits, quotas
Gemini 3 Guide	gemini-3.md	Gemini 3 specific features and best practices
Imagen	imagen.md	Image generation with Imagen model
Embeddings	embeddings.md	Text embeddings for search/RAG
Veo	veo.md	Video generation with Veo 3.1 (69K)
Lyria	lyria.md	Music generation with Lyria RealTime
Robotics	robotics.md	Gemini Robotics-ER 1.5 (42K)

Core Capabilities

Topic	File	Description
Text Generation	text-generation.md	Text generation, system instructions (38K)
Image Gen (Nano Banana)	image-generation-gemini.md	Native image generation with Gemini (LARGE: 174K)
Image Understanding	image-understanding.md	Vision, image analysis
Video Understanding	video-understanding.md	Video analysis, timestamps
Document Understanding	document-understanding.md	PDF and document processing
Speech Generation	speech-generation.md	Text-to-speech (TTS)
Audio Understanding	audio-understanding.md	Audio analysis, transcription

Advanced Features

Topic	File	Description
Thinking Mode	thinking.md	Extended reasoning capabilities
Thought Signatures	thought-signatures.md	EDGE CASE ONLY: Manual signature handling when NOT using official SDKs
Structured Outputs	structured-outputs.md	JSON schema responses
Function Calling	function-calling.md	Custom tool integration (54K)
Long Context	long-context.md	1M+ token handling, context caching

Tools

Topic	File	Description
Tools Overview	tools-overview.md	Built-in tools summary, agent frameworks
Google Search	google-search.md	Web search grounding
Google Maps	google-maps.md	Location-aware grounding
Code Execution	code-execution.md	Python code execution tool
URL Context	url-context.md	URL content extraction
Computer Use	computer-use.md	Browser automation (preview) (44K)
File Search	file-search.md	RAG with document indexing

Live API (Real-time Streaming)

Topic	File	Description
Getting Started	live-api-getting-started.md	Low-latency voice/video interactions
Capabilities Guide	live-api-capabilities.md	Full capabilities and configurations (32K)
Tool Use	live-api-tools.md	Function calling & Search in Live API
Session Management	live-api-sessions.md	Session handling, time limits
Ephemeral Tokens	ephemeral-tokens.md	Short-lived auth for client-side WebSockets

Guides

Topic	File	Description
Batch API	batch-api.md	Async processing at 50% cost (47K)
Files API	files-api.md	Upload and manage media files (49K)
Context Caching	context-caching.md	Implicit & explicit caching for cost savings
Media Resolution	media-resolution.md	Control token allocation for media
Tokens	tokens.md	Understand and count tokens
Prompt Design	prompt-design.md	Prompt strategies and best practices (47K)
Logs & Datasets	logs-datasets.md	Enable logging, view in AI Studio
Data Logging & Sharing	data-logging-sharing.md	Storage and management of API logs
Safety Settings	safety-settings.md	Adjust safety filters
Safety Guidance	safety-guidance.md	Best practices for safe AI use

Troubleshooting & Migration

Topic	File	Description
Troubleshooting	troubleshooting.md	Diagnose and resolve common API issues (25K)
Vertex AI Comparison	vertex-ai-comparison.md	READ ONLY IF USER MENTIONS "VERTEX AI": Gemini Developer API vs Vertex AI differences

Large Files - Search Patterns

For large reference files (>30K), use grep to find specific sections:

image-generation-gemini.md (174K):

bash

grep -n "## " references/image-generation-gemini.md  # List sections
grep -n "edit" references/image-generation-gemini.md  # Find editing info
grep -n "style" references/image-generation-gemini.md  # Find style transfer

veo.md (69K):

bash

grep -n "## " references/veo.md  # List sections
grep -n "audio" references/veo.md  # Find audio generation info

models-overview.md (67K):

bash

grep -n "gemini-3" references/models-overview.md
grep -n "context" references/models-overview.md

function-calling.md (54K):

bash

grep -n "## " references/function-calling.md
grep -n "parallel" references/function-calling.md  # Parallel function calls

Common Patterns

Multimodal Input (Image + Text)

python

from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_image(image_path),
        types.Part.from_text("Describe this image")
    ]
)

Function Calling

python

tools = [
    types.Tool(function_declarations=[{
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"]
        }
    }])
]

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What's the weather in Paris?",
    config=types.GenerateContentConfig(tools=tools)
)

Google Search Grounding

python

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What are the latest AI developments?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    )
)

Thinking Mode

python

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Solve this complex problem...",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget_tokens=10000)
    )
)

Streaming

python

for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a story"
):
    print(chunk.text, end="")

Key Concepts

Tool Execution Flow

Built-in tools (Google Search, Code Execution): Executed by Google

•Send prompt with tool config → Model executes tool → Response with grounded results

Custom tools (Function Calling): You execute

•Send prompt with function declarations → Model returns function call JSON
•You execute function, send result back → Model generates final response

Thought Signatures (Important)

•If using official SDKs with chat feature: Thought signatures are handled automatically. No action needed.
•If manually managing conversation history: Read thought-signatures.md for Gemini 3 Pro function calling requirements.

API Endpoints

Endpoint	Purpose
`/v1beta/models/{model}:generateContent`	Standard generation
`/v1beta/models/{model}:streamGenerateContent`	Streaming
`/v1beta/models/{model}:embedContent`	Embeddings
`/v1beta/models/{model}:countTokens`	Token counting

Base URL: https://generativelanguage.googleapis.com