Debugging and Observability with Microsoft Agent Framework

This skill helps you debug AI agents and set up observability using DevUI and OpenTelemetry integration.

When to Use This Skill

•Debugging agent behavior issues
•Setting up distributed tracing with OpenTelemetry
•Monitoring agent performance and latency
•Analyzing agent conversation flows
•Using DevUI for interactive testing

DevUI - Interactive Developer Interface

DevUI is an interactive web-based interface for agent development, testing, and debugging.

Installation

bash

pip install agent-framework[devui] --pre

Starting DevUI

python

from agent_framework.devui import DevUI
from agent_framework.azure import AzureOpenAIResponsesClient
from azure.identity import AzureCliCredential

# Create your agent
agent = AzureOpenAIResponsesClient(
    credential=AzureCliCredential()
).as_agent(
    name="DebugAgent",
    instructions="You are a helpful assistant."
)

# Launch DevUI
devui = DevUI()
devui.register_agent(agent)
devui.start(port=8080)
# Open http://localhost:8080 in your browser

DevUI Features

•Interactive chat: Test agents in real-time
•Conversation history: View and replay past conversations
•Tool execution viewer: See tool calls and responses
•Token usage tracking: Monitor API consumption
•Workflow visualization: Visualize multi-agent workflows
•Time-travel debugging: Step through workflow execution

OpenTelemetry Integration

Microsoft Agent Framework has built-in OpenTelemetry support for distributed tracing and monitoring.

Python - Basic Setup

python

import asyncio
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor

from agent_framework.azure import AzureOpenAIResponsesClient
from agent_framework.observability import configure_telemetry
from azure.identity import AzureCliCredential

# Configure OpenTelemetry
provider = TracerProvider()
processor = SimpleSpanProcessor(ConsoleSpanExporter())
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# Instrument HTTP client
HTTPXClientInstrumentor().instrument()

# Enable Agent Framework telemetry
configure_telemetry(enable_tracing=True)

async def main():
    agent = AzureOpenAIResponsesClient(
        credential=AzureCliCredential()
    ).as_agent(
        name="TracedAgent",
        instructions="You are a helpful assistant."
    )
    
    # All agent calls will now be traced
    response = await agent.run("Hello!")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())

Python - Export to Azure Monitor

python

from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Configure Azure Monitor exporter
exporter = AzureMonitorTraceExporter(
    connection_string="InstrumentationKey=your-key;..."
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

# Continue with agent setup...

Python - Export to Jaeger

python

from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

exporter = JaegerExporter(
    agent_host_name="localhost",
    agent_port=6831,
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

# Continue with agent setup...

.NET - Basic Setup

csharp

using OpenTelemetry;
using OpenTelemetry.Trace;
using Microsoft.Agents.AI.Telemetry;

// Configure OpenTelemetry
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddAgentFrameworkInstrumentation()
    .AddConsoleExporter()
    .Build();

// All agent calls will be traced
var agent = client.GetOpenAIResponseClient("gpt-4o-mini")
    .AsAIAgent(name: "TracedAgent", instructions: "You are helpful.");

var response = await agent.RunAsync("Hello!");
Console.WriteLine(response);

.NET - Export to Azure Monitor

csharp

using Azure.Monitor.OpenTelemetry.Exporter;

using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddAgentFrameworkInstrumentation()
    .AddAzureMonitorTraceExporter(options =>
    {
        options.ConnectionString = "InstrumentationKey=...";
    })
    .Build();

Debugging Techniques

1. Enable Verbose Logging

python

import logging

# Set logging level for agent framework
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("agent_framework").setLevel(logging.DEBUG)

2. Capture Diagnostic Information

python

async def debug_agent_call(agent, prompt):
    try:
        response = await agent.run(prompt)
        return response
    except Exception as e:
        # Log diagnostic information
        print(f"Error: {e}")
        print(f"Diagnostics: {agent.get_diagnostics()}")
        raise

3. Inspect Tool Calls

python

from agent_framework.middleware import LoggingMiddleware

# Add middleware to log all tool calls
agent = client.as_agent(
    name="DebugAgent",
    instructions="...",
    middleware=[LoggingMiddleware(log_tool_calls=True)]
)

4. Track Token Usage

python

async def main():
    agent = client.as_agent(name="Agent", instructions="...")
    
    response = await agent.run("Complex prompt...")
    
    # Access usage statistics
    usage = agent.last_usage
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Total tokens: {usage.total_tokens}")

5. Time-Travel Debugging for Workflows

python

from agent_framework.workflows import Workflow
from agent_framework.devui import DevUI

workflow = Workflow("my-workflow")
# ... configure workflow ...

# Enable time-travel debugging
devui = DevUI()
devui.register_workflow(workflow)
devui.enable_time_travel()
devui.start()

# Run workflow - you can step through execution in DevUI
await workflow.run(input_data)

Common Issues and Solutions

Issue: Agent produces inconsistent results

Solution: Add temperature control and seed for reproducibility:

python

agent = client.as_agent(
    name="ConsistentAgent",
    instructions="...",
    temperature=0.0,  # Deterministic
    seed=42           # Reproducible
)

Issue: Tool calls failing silently

Solution: Add error handling middleware:

python

from agent_framework.middleware import ErrorHandlingMiddleware

agent = client.as_agent(
    name="RobustAgent",
    middleware=[ErrorHandlingMiddleware(
        on_error="retry",
        max_retries=3,
        log_errors=True
    )]
)

Issue: High latency

Solution: Use streaming and check trace spans:

python

# Enable streaming for faster first-token response
async for chunk in agent.stream("Long prompt..."):
    print(chunk, end="", flush=True)

# Check OpenTelemetry spans for latency breakdown

Issue: Rate limiting (429 errors)

Solution: Implement retry with exponential backoff:

python

from agent_framework.middleware import RetryMiddleware

agent = client.as_agent(
    name="RetryAgent",
    middleware=[RetryMiddleware(
        max_retries=5,
        initial_delay=1.0,
        exponential_backoff=True
    )]
)

Metrics to Monitor

Metric	Description	Alert Threshold
`agent.request.duration`	Time per request	> 10s
`agent.token.usage`	Tokens per request	> 4000
`agent.tool.calls`	Tool invocations	> 10 per request
`agent.error.rate`	Percentage of failures	> 5%
`workflow.step.duration`	Time per workflow step	> 30s

Best Practices

•Always enable tracing in production - Essential for debugging
•Use structured logging - Include correlation IDs
•Set up alerts - Monitor error rates and latency
•Use DevUI in development - Faster iteration
•Export traces to centralized system - Azure Monitor, Jaeger, etc.
•Add custom spans - For business-specific metrics

agent-debugging

Debugging and Observability with Microsoft Agent Framework

When to Use This Skill

DevUI - Interactive Developer Interface

Installation

Starting DevUI

DevUI Features

OpenTelemetry Integration

Python - Basic Setup

Python - Export to Azure Monitor

Python - Export to Jaeger

.NET - Basic Setup

.NET - Export to Azure Monitor

Debugging Techniques

1. Enable Verbose Logging

2. Capture Diagnostic Information

3. Inspect Tool Calls

4. Track Token Usage

5. Time-Travel Debugging for Workflows

Common Issues and Solutions

Issue: Agent produces inconsistent results

Issue: Tool calls failing silently

Issue: High latency

Issue: Rate limiting (429 errors)

Metrics to Monitor

Best Practices

References