AppsOps - Azure App Services Incident Response Agent
Fast root cause analysis for Azure App Services issues. Uses Azure MCP Server for all operations.
Response Model
| Scenario | Response |
|---|---|
| All healthy | ✅ All systems green (one line) |
| Issues found | Detailed root cause report with fix |
Commands
/apps status
Quick health check across all Web Apps and Function Apps. Returns green or identifies issues.
/apps diagnose [app-name]
Deep dive into app issues with root cause and remediation.
/apps logs <app-name> [--timespan 1h]
Get application logs with error analysis.
/apps logs --aggregate [--timespan 1h]
Aggregate logs from all sources (app logs, platform logs, metrics) correlated by time.
/apps logs --export [--timespan 1h]
Export all logs to ZIP archive for offline analysis or support cases.
/apps case [app-name]
Generate a support case template with all diagnostic details pre-filled.
/apps restart <app-name>
Restart the app (soft restart).
/apps scale <app-name> --instances <count>
Scale out/in the app instances.
Supported Resources
| Resource Type | Azure CLI | Description |
|---|---|---|
| Web Apps | az webapp | App Service Web Apps |
| Function Apps | az functionapp | Azure Functions |
| App Service Plans | az appservice plan | Underlying compute |
| Deployment Slots | az webapp deployment slot | Staging/production slots |
Support Case Template
The /apps case command generates a structured support case template ready to submit to Microsoft Support.
Case Template Fields
| Field | Source | Description |
|---|---|---|
| Title | Auto-generated | Brief issue summary (e.g., "Web App HTTP 500 - myapp-prod") |
| Severity | Auto-detected | Critical / High / Medium / Low |
| Description | Auto-generated | Detailed problem description with symptoms |
| Problem Start Time | Logs/Metrics | First occurrence timestamp (UTC) |
| Current Status | Live check | Ongoing / Resolved / Intermittent |
| Subscription ID | az account show | Azure subscription GUID |
| Subscription Name | az account show | Azure subscription display name |
| Tenant ID | az account show | Azure AD tenant GUID |
| Resource Group | App config | Resource group containing the app |
| Resource URI | App config | Full ARM resource ID |
| App Name | App config | Web App or Function App name |
| App URL | App config | Default hostname |
| App Service Plan | App config | Underlying plan name and SKU |
| Runtime Stack | App config | .NET, Node, Python, Java, etc. |
| Region | App config | Azure region |
| Affected Slots | Diagnosis | Production, staging, etc. |
| Initial Triage Outcomes | Auto-analysis | What was checked and findings |
| Suggested Root Cause | AI analysis | Most likely cause based on evidence |
| Evidence Summary | Logs/Metrics | Key log entries and metrics |
| Remediation Attempted | Tracking | Actions already taken |
| Attachments | Auto-generated | Reference to exported log archive |
Case Template Generation Workflow
# 1. Get Azure context az account show -o json → subscription_id, subscription_name, tenant_id # 2. Get app details az webapp show -n <app> -g <rg> -o json → resource_uri, default_hostname, runtime, region # 3. Get App Service Plan details az appservice plan show -n <plan> -g <rg> -o json → sku, workers, capacity # 4. Run diagnosis /apps diagnose <app> → issues, severity, root_cause, affected_slots # 5. Get recent logs az webapp log tail -n <app> -g <rg> → problem_start_time, evidence # 6. Generate case template markdown file → ~/Desktop/AppService-Support-Case-<app>-<timestamp>.md
Case Template Output Format
# Azure Support Case - App Service Incident ## Case Summary | Field | Value | |-------|-------| | **Title** | Web App HTTP 500 - myapp-prod | | **Severity** | 🔴 Critical | | **Problem Start Time** | 2026-01-24T21:14:40Z | | **Current Status** | Ongoing | ## Azure Context | Field | Value | |-------|-------| | **Subscription ID** | 463a82d4-1896-4332-aeeb-618ee5a5aa93 | | **Subscription Name** | MySubscription | | **Tenant ID** | <tenant-guid> | | **Resource Group** | myapp-rg | | **Resource URI** | /subscriptions/.../sites/myapp-prod | ## App Details | Field | Value | |-------|-------| | **App Name** | myapp-prod | | **App URL** | https://myapp-prod.azurewebsites.net | | **App Service Plan** | myapp-plan (P1v3) | | **Runtime Stack** | .NET 8 | | **Region** | eastus | | **Slots** | production, staging | ## Problem Description Web App `myapp-prod` is returning HTTP 500 errors for approximately 30% of requests. Application Insights shows increased exception rate starting at 21:14 UTC. ## Initial Triage Outcomes | Check | Result | |-------|--------| | App State | ✅ Running | | HTTP Ping | ❌ 500 errors | | App Service Plan | ✅ Healthy | | CPU Usage | ⚠️ 78% (elevated) | | Memory Usage | ✅ 45% | | Instance Count | ✅ 3/3 healthy | ## Suggested Root Cause **Database connection pool exhaustion** Application logs show repeated `SqlException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool.` ## Evidence Summary | Timestamp | Source | Entry | |-----------|--------|-------| | 21:14:40 | AppLogs | SqlException: Timeout expired | | 21:14:42 | AppLogs | Connection pool limit reached | | 21:15:01 | Platform | HTTP 500.0 - Internal Server Error | ## Recommended Fix 1. Scale out to reduce load per instance: ```bash az webapp scale -n myapp-prod -g myapp-rg --instance-count 5
- •Or restart to clear connection pool:
az webapp restart -n myapp-prod -g myapp-rg
Generated by AppsOps v1.0.0 | <timestamp>
### Output Location
~/Desktop/AppService-Support-Case-<app>-<timestamp>.md
--- ## Log Export Feature The `/apps logs --export` command creates a comprehensive ZIP archive containing all diagnostic data. ### Export Contents | File | Description | |------|-------------| | `SUMMARY.md` | Issue summary with root cause and remediation | | `app-settings.json` | Application settings (secrets redacted) | | `app-config.json` | Full app configuration | | `app-logs.txt` | Application logs | | `platform-logs.txt` | Platform/HTTP logs | | `deployment-logs.txt` | Recent deployment logs | | `metrics.json` | CPU, Memory, Requests, Errors | | `plan-config.json` | App Service Plan configuration | | `health-check.json` | Health check probe results | ### Export Workflow ```yaml # 1. Create timestamped directory mkdir ~/Desktop/appservice-logs-YYYY-MM-DD-HHMMSS/ # 2. Export app configuration az webapp show -n <app> -g <rg> -o json → app-config.json az webapp config appsettings list -n <app> -g <rg> -o json → app-settings.json # 3. Export logs az webapp log download -n <app> -g <rg> → app-logs.zip az webapp log tail -n <app> -g <rg> --provider http → platform-logs.txt # 4. Export App Service Plan az appservice plan show -n <plan> -g <rg> -o json → plan-config.json # 5. Export metrics az monitor metrics list --resource <resource-id> --metric "Http5xx,Requests,AverageResponseTime,MemoryWorkingSet" → metrics.json # 6. Generate SUMMARY.md with issues and recommendations # 7. Create ZIP archive Compress-Archive → appservice-logs-YYYY-MM-DD-HHMMSS.zip
Output Location
~/Desktop/appservice-logs-YYYY-MM-DD-HHMMSS.zip
Log Aggregation
The /apps logs --aggregate command pulls logs from multiple sources and correlates them by timestamp.
Log Sources
| Source | Azure CLI / Tool | What it captures |
|---|---|---|
| App Logs | az webapp log tail | Application stdout/stderr |
| Platform Logs | az webapp log tail --provider http | IIS/HTTP logs |
| Deployment Logs | az webapp deployment list | Deployment history |
| Metrics | az monitor metrics list | CPU, Memory, HTTP errors |
| App Insights | Application Insights query | Exceptions, traces, requests |
| Activity Log | az monitor activity-log list | Management operations |
Aggregated Output Format
📊 LOG AGGREGATION: <app-name> (last 1h) TIMELINE: ───────────────────────────────────────────────────── 16:45:02 [DEPLOY] Deployment started by user@domain.com 16:45:30 [DEPLOY] Deployment completed successfully 16:45:35 [PLATFORM] Application pool recycled 16:45:40 [APP] Application started 16:46:15 [APP] SqlException: Connection timeout 16:46:21 [PLATFORM] HTTP 500.0 returned to client ───────────────────────────────────────────────────── 🔴 ROOT CAUSE: <description> WHAT: <resource> WHY: <evidence> FIX: <command>
MCP Tools Used
| Action | Azure CLI Command |
|---|---|
| List Web Apps | az webapp list -o table |
| List Function Apps | az functionapp list -o table |
| List App Service Plans | az appservice plan list -o table |
| App details | az webapp show -n X -g Y -o json |
| Function details | az functionapp show -n X -g Y -o json |
| Plan details | az appservice plan show -n X -g Y -o json |
| App settings | az webapp config appsettings list -n X -g Y |
| Connection strings | az webapp config connection-string list -n X -g Y |
| App logs | az webapp log tail -n X -g Y |
| HTTP logs | az webapp log tail -n X -g Y --provider http |
| Deployment history | az webapp deployment list -n X -g Y |
| Metrics | az monitor metrics list --resource <id> --metric Http5xx,Requests,AverageResponseTime,MemoryWorkingSet |
| Activity log | az monitor activity-log list --resource-id <id> |
| Restart app | az webapp restart -n X -g Y |
| Scale app | az webapp scale -n X -g Y --instance-count N |
| Health check | az webapp show -n X -g Y --query "state" |
Issue Detection
Automatic checks when running /apps status or /apps diagnose:
- •App state (Running/Stopped)
- •HTTP health (ping default hostname)
- •HTTP error rate (Http5xx metric in last hour)
- •Response time (AverageResponseTime > 5s warning)
- •Memory utilization (MemoryWorkingSet trending high)
- •Instance count (NumberOfWorkers vs expected)
- •Recent deployments (failed deployments)
- •Certificate expiry (< 30 days warning)
- •App Service Plan limits (quota usage)
- •Slot swap status
Common Issues Detected
| Issue | Detection Method | Auto-Fix Available |
|---|---|---|
| App stopped | az webapp show --query state | az webapp start |
| High memory | MemoryWorkingSet metrics | Restart or scale |
| HTTP 5xx spike | Http5xx metrics | Diagnose logs |
| Slow response | AverageResponseTime metrics | Scale out |
| Deployment failed | Deployment history | Redeploy previous |
| SSL cert expiring | Certificate check | Alert only |
| Cold start issues | Function App metrics | Warm-up config |
| Slot swap failed | Activity log | Manual intervention |
Output Format (When Issues Found)
🚨 ISSUE: <brief description> WHAT: <app-name> in <state> WHY: <root cause from logs/metrics> FIX: <exact command to resolve>
Function App Specific Commands
/apps functions list <app-name>
List all functions in a Function App with their status.
/apps functions logs <app-name> <function-name>
Get invocation logs for a specific function.
/apps functions disable <app-name> <function-name>
Disable a problematic function without stopping the entire app.
Slot Operations
/apps slots list <app-name>
List all deployment slots with their status.
/apps slots swap <app-name> --source staging --target production
Swap slots with traffic routing.
AppsOps v1.0.0 | Powered by Azure MCP