Pydantic AI Agent Framework
Official Docs: https://ai.pydantic.dev/ Docs Index (for LLMs): https://ai.pydantic.dev/llms.txt
This skill provides quick reference patterns. For the most up-to-date API details, fetch the docs index above and then the specific section you need.
Pydantic AI is a Python agent framework with FastAPI-like ergonomics. It provides type-safe outputs, dependency injection, and seamless integration with multiple LLM providers.
Quick Start
python
from pydantic_ai import Agent
agent = Agent(
'google-gla:gemini-2.5-flash', # Model name
instructions='Be concise, reply with one sentence.',
)
result = agent.run_sync('What is Pydantic AI?')
print(result.output)
Installation
bash
uv add pydantic-ai # Full install uv add "pydantic-ai-slim[google]" # Slim with Google/Gemini only uv add "pydantic-ai-slim[google,dbos]" # With durable execution
Core Concepts
Agent Definition with Structured Output
python
from dataclasses import dataclass
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
@dataclass
class AgentDeps:
"""Dependencies injected into tools and instructions."""
user_id: str
db: DatabaseConn
class AgentOutput(BaseModel):
"""Structured output validated by Pydantic."""
answer: str = Field(description='The answer to the question')
confidence: float = Field(ge=0, le=1)
agent = Agent(
'google-gla:gemini-2.5-flash',
deps_type=AgentDeps,
output_type=AgentOutput,
instructions='You are a helpful assistant.',
)
Tools (Function Calling)
python
@agent.tool
async def search_database(ctx: RunContext[AgentDeps], query: str) -> list[dict]:
"""Search the database for matching records.
Args:
query: The search query to execute.
"""
return await ctx.deps.db.search(query)
@agent.tool_plain # No context needed
def get_current_time() -> str:
"""Get the current time."""
return datetime.now().isoformat()
Dynamic Instructions
python
@agent.instructions
async def add_context(ctx: RunContext[AgentDeps]) -> str:
user = await ctx.deps.db.get_user(ctx.deps.user_id)
return f"The user's name is {user.name}."
Running the Agent
python
# Synchronous
result = agent.run_sync('Hello', deps=deps)
# Async
result = await agent.run('Hello', deps=deps)
# Streaming (see STREAMING.md)
async with agent.run_stream('Hello', deps=deps) as response:
async for chunk in response.stream_output():
print(chunk, end='', flush=True)
# Access results
print(result.output) # AgentOutput instance
print(result.usage()) # Token usage
print(result.new_messages()) # Conversation history
Key Files Reference
- •GEMINI.md - Google/Gemini model configuration and thinking mode
- •TOOLS.md - Tool definitions, validation, and error handling
- •STREAMING.md - FastAPI streaming patterns (SSE, NDJSON)
- •MULTI-AGENT.md - Agent delegation and orchestration patterns
- •DURABLE.md - Durable execution with DBOS/Temporal
Model Names
python
# Google AI Studio (API key)
'google-gla:gemini-2.5-flash'
'google-gla:gemini-2.5-pro'
'google-gla:gemini-3.0-pro' # Latest
# Google Vertex AI (service account)
agent = Agent(GoogleModel('gemini-2.5-flash', provider='google-vertex'))
# Other providers
'openai:gpt-5'
'anthropic:claude-sonnet-4-0'
Environment Variables
bash
GOOGLE_API_KEY=... # Google AI Studio GOOGLE_CLOUD_PROJECT=... # Vertex AI project
Common Patterns
Conversation History
python
# First turn
result1 = await agent.run('Hello')
# Continue conversation
result2 = await agent.run(
'Follow up question',
message_history=result1.new_messages()
)
Usage Limits
python
from pydantic_ai import UsageLimits
result = await agent.run(
prompt,
usage_limits=UsageLimits(
request_limit=10, # Max API calls
tool_calls_limit=5, # Max tool executions
total_tokens_limit=4000 # Max tokens
)
)
Error Handling with Retry
python
from pydantic_ai import ModelRetry
@agent.tool
async def get_user(ctx: RunContext, name: str) -> dict:
user = await db.find_user(name)
if user is None:
raise ModelRetry(f"No user named '{name}'. Try a different name.")
return user
Best Practices
- •Always define
output_typefor structured responses - •Use
deps_typefor dependency injection (DB, HTTP clients, config) - •Pass
usage=ctx.usagewhen calling sub-agents for accurate tracking - •Tool docstrings are sent to the LLM - make them descriptive
- •Use
run_streamfor real-time UX in web applications