ada-issue-diagnosis

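Diagnose performance issues in an Ada AI agent and investigate sudden changes in its metrics. Use this skill when the user notices a drop in CSAT or AR, asks "why did X happen", wants to investigate a specific problem in depth, or needs to pin down the root cause of a performance change.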

SKILL.md
---
name: ada-issue-diagnosis
description: Diagnose performance issues and investigate sudden changes in Ada AI agent metrics. Use when the user notices a drop in CSAT or AR, asks "why did X happen", wants to investigate a specific problem, or needs to understand root causes of performance changes.
license: Apache-2.0
compatibility: Requires Ada MCP server connected to Claude Desktop or compatible MCP client
metadata:
  author: ada
  version: "1.0"
allowed-tools: get_ada_metric get_available_filters get_conversations_by_filters get_conversation get_ada_configuration search_knowledge search_coaching
---

Diagnosing Ada Performance Issues

When to use this skill

Use this skill when the user wants to:

  • Investigate why CSAT (customer satisfaction) or AR (automated resolution rate) dropped
  • Understand a sudden change in metrics
  • Diagnose a specific problem ("customers are complaining about X")
  • Find root causes of performance issues
  • Analyze what went wrong in a specific timeframe

Diagnostic workflow

Step 1: Confirm the issue

First, verify and quantify the problem:

code
Use get_ada_metric to compare:
- Problem period (e.g., yesterday, this week)
- Baseline period (e.g., previous day, last week)

Quantify:

  • How big is the change?
  • When did it start?
  • Is it ongoing or resolved?
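
Once both values are in hand, sizing the change is simple arithmetic. The following is a minimal sketch in Python, assuming the two metric values have already been read from the get_ada_metric results; the function name `quantify_change` and its return shape are illustrative and not part of the Ada MCP server.

```python
def quantify_change(baseline: float, problem: float) -> dict:
    """Compare a percentage metric between a baseline and a problem period."""
    # Absolute change, in percentage points
    point_change = problem - baseline
    # Relative change is undefined when the baseline is zero
    relative_change = (point_change / baseline * 100) if baseline else None
    return {
        "point_change": round(point_change, 1),
        "relative_change_pct": None if relative_change is None else round(relative_change, 1),
    }


# Example values: AR moving from 71% to 58%
change = quantify_change(baseline=71.0, problem=58.0)
print(change)
```

Reporting both figures helps: a 13-point drop from 71% is roughly an 18% relative decline, which reads very differently from a 13-point drop off a 20% baseline.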

Step 2: Isolate the problem area

Narrow down where the issue is occurring:

code
Use get_available_filters to understand filtering options
Use get_conversations_by_filters with various filters:
- By CSAT score (if CSAT issue)
- By resolution status (if AR issue)
- By handoff status
- By specific date ranges

Look for:

  • Is the issue across all conversations or specific segments?
  • Does it correlate with specific topics, channels, or times?
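
To see whether the issue is concentrated in one segment, it can help to tabulate a rate per segment. This is a rough sketch, assuming the filtered conversations have been collected into a list of dictionaries; the field names `channel` and `resolved` are hypothetical and may not match what get_conversations_by_filters actually returns.

```python
from collections import defaultdict


def resolution_rate_by_segment(conversations, segment_key):
    """Return {segment: (resolved_count, total, rate_pct)} for a list of dicts."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [resolved, total]
    for conv in conversations:
        segment = conv.get(segment_key, "unknown")
        counts[segment][1] += 1
        if conv.get("resolved"):
            counts[segment][0] += 1
    return {
        segment: (resolved, total, round(100 * resolved / total, 1))
        for segment, (resolved, total) in counts.items()
    }


# Hypothetical sample; real records would come from get_conversations_by_filters
sample = [
    {"channel": "web", "resolved": True},
    {"channel": "web", "resolved": True},
    {"channel": "web", "resolved": False},
    {"channel": "sms", "resolved": False},
    {"channel": "sms", "resolved": False},
]
rates = resolution_rate_by_segment(sample, "channel")
print(rates)
```

Keeping the raw counts next to each rate makes it easier to tell a real segment-level problem from noise in a segment with only a handful of conversations.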

Step 3: Analyze affected conversations

Examine the problematic conversations in detail:

code
Use get_conversation on 15-25 conversations from the problem period

Compare to baseline:

code
Use get_conversation on 10-15 conversations from normal performance period

Identify differences:

  • New types of inquiries?
  • Different customer behavior?
  • Agent responding differently?
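
One way to make "new types of inquiries" concrete is to compare how often each topic appears in the two samples. The sketch below assumes each reviewed conversation has been given a topic label by hand while reading the transcript; the labels and the helper name `topic_share_shift` are illustrative only.

```python
from collections import Counter


def topic_share_shift(problem_topics, baseline_topics):
    """Return (topic, change in share in percentage points), largest increase first."""
    problem_counts = Counter(problem_topics)
    baseline_counts = Counter(baseline_topics)
    shifts = {}
    for topic in set(problem_counts) | set(baseline_counts):
        # max(..., 1) avoids division by zero for an empty sample
        problem_share = 100 * problem_counts[topic] / max(len(problem_topics), 1)
        baseline_share = 100 * baseline_counts[topic] / max(len(baseline_topics), 1)
        shifts[topic] = round(problem_share - baseline_share, 1)
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical labels assigned while reading transcripts
problem = ["order status"] * 6 + ["billing"] * 2 + ["returns"] * 2
baseline = ["order status"] * 3 + ["billing"] * 4 + ["returns"] * 3
shifts = topic_share_shift(problem, baseline)
print(shifts)
```

With samples of 10 to 25 conversations, shifts like these are only suggestive; they point to which topic to filter on next rather than proving a cause.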

Step 4: Check for configuration changes

Review what might have changed:

code
Use get_ada_configuration to review:
- Playbooks
- Guidance/custom instructions
- Actions
- Coaching rules

Ask the user:

  • Were any changes made recently?
  • New deployments or updates?
  • Changes to integrations or backend systems?
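
If an earlier snapshot of the configuration is available (for example, one the user exported before the problem period), a simple diff can surface what changed. This sketch assumes both snapshots have been flattened into dictionaries; the keys shown are hypothetical, and this skill does not state that get_ada_configuration returns change history.

```python
def diff_config(before: dict, after: dict) -> dict:
    """Report keys added, removed, or changed between two configuration snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(k for k in set(before) & set(after) if before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots; keys and values are illustrative only
before = {"playbook:refunds": "v3", "action:order_status": "enabled"}
after = {
    "playbook:refunds": "v4",
    "action:order_status": "enabled",
    "coaching:tone": "v1",
}
changes = diff_config(before, after)
print(changes)
```

When no earlier snapshot exists, the questions to the user remain the main source of change history.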

Step 5: Look for external factors

Consider non-configuration causes:

  • Seasonal patterns or events
  • Marketing campaigns driving new traffic
  • Product launches or changes
  • External events affecting customer behavior

Step 6: Root cause synthesis

Combine findings into a diagnosis:

markdown
## Diagnosis

**Issue**: [Specific problem observed]
**Timeframe**: [When it started/occurred]
**Magnitude**: [How severe: percentage change, number of conversations affected]

**Root Cause**: [Most likely explanation]

**Evidence**:
1. [Supporting finding 1]
2. [Supporting finding 2]
3. [Supporting finding 3]

**Contributing Factors**:
- [Additional factor if applicable]
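
If the diagnosis is assembled programmatically, the template above can be filled from structured fields. The following is an illustrative sketch; the `Diagnosis` class is a hypothetical helper, not something the Ada MCP server provides, and the example values are taken from the sample diagnosis later in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnosis:
    issue: str
    timeframe: str
    magnitude: str
    root_cause: str
    evidence: list[str]
    contributing_factors: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the fields in the same layout as the diagnosis template."""
        lines = [
            "## Diagnosis",
            "",
            f"**Issue**: {self.issue}",
            f"**Timeframe**: {self.timeframe}",
            f"**Magnitude**: {self.magnitude}",
            "",
            f"**Root Cause**: {self.root_cause}",
            "",
            "**Evidence**:",
        ]
        lines += [f"{i}. {item}" for i, item in enumerate(self.evidence, start=1)]
        # The contributing-factors section is omitted when there are none
        if self.contributing_factors:
            lines += ["", "**Contributing Factors**:"]
            lines += [f"- {item}" for item in self.contributing_factors]
        return "\n".join(lines)


report = Diagnosis(
    issue="AR drop",
    timeframe="January 25",
    magnitude="71% to 58% (13 points)",
    root_cause="Order status API integration failure",
    evidence=["89% of unresolved conversations involved order status inquiries"],
).to_markdown()
print(report)
```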

Step 7: Provide remediation steps

Based on root cause, recommend fixes:

markdown
## Recommended Actions

### Immediate (Do now)
- [Quick fix to stop the bleeding]

### Short-term (This week)
- [More thorough fix]

### Prevention (Ongoing)
- [How to prevent recurrence]

Common issue patterns

CSAT Drop

| Pattern | Common Causes | Investigation Focus |
|---|---|---|
| Sudden drop | Recent config change, broken integration | Config history, recent transcripts |
| Gradual decline | Knowledge becoming outdated, new topics | Topic analysis, knowledge gaps |
| Drop for specific topic | Article issue, playbook problem | Topic-filtered conversations |

AR Drop

| Pattern | Common Causes | Investigation Focus |
|---|---|---|
| Sudden drop | Handoff rule change, action failure | Config changes, error patterns |
| Gradual decline | New question types, shifting traffic | Topic distribution changes |
| Increased handoffs | Trigger sensitivity, customer behavior | Handoff reason analysis |

Volume Spike

| Pattern | Common Causes | Investigation Focus |
|---|---|---|
| Sudden spike | Marketing campaign, incident, seasonality | Inquiry topics, external events |
| Gradual increase | Organic growth, new channels | Channel distribution |
| Quality drop with volume | Overwhelmed playbooks, edge cases | Edge case frequency |
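
The patterns above distinguish sudden from gradual changes. A rough heuristic for telling them apart from a daily series is sketched below, assuming one metric value per day has been collected with get_ada_metric; the 0.6 threshold is an arbitrary illustration, not an Ada-defined value.

```python
def classify_drop(daily_values, sudden_share=0.6):
    """Label a decline 'sudden' if one day-over-day step accounts for most of it."""
    if len(daily_values) < 2:
        return "no drop"
    total_drop = daily_values[0] - daily_values[-1]
    if total_drop <= 0:
        return "no drop"
    # Largest single-day decrease across consecutive days
    largest_step = max(a - b for a, b in zip(daily_values, daily_values[1:]))
    return "sudden" if largest_step / total_drop >= sudden_share else "gradual"


# Hypothetical daily AR values (percent)
print(classify_drop([71, 70, 71, 58, 58]))
print(classify_drop([71, 68, 66, 63, 60, 58]))
```

A "sudden" label points toward the configuration and integration checks in Step 4, while "gradual" points toward topic distribution and knowledge gaps.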

Example diagnosis output

markdown
## Issue Diagnosis: AR Drop on January 25

### Summary
AR dropped from 71% to 58% on January 25, a 13-point decline.

### Root Cause
The order status API integration began failing at 2 a.m. on January 25. Order status inquiries that previously resolved automatically are now being handed off because the agent cannot retrieve order information.

### Evidence
1. 89% of unresolved conversations on Jan 25 involved order status inquiries
2. Agent responses show "I'm unable to retrieve your order status" (API failure message)
3. Same inquiry type had 94% resolution rate the previous week
4. Order status action returning errors in all sampled conversations

### Recommended Actions

**Immediate**
- Check order status API health and connectivity
- Contact backend team to restore API access

**Short-term**
- Add graceful fallback when API is unavailable
- Set up monitoring/alerts for integration failures

**Prevention**
- Implement health checks for critical integrations
- Create runbook for API failure scenarios

Tips for effective diagnosis

  • Always compare problem period to baseline
  • Look for the simplest explanation first
  • Check for recent changes before assuming complex causes
  • Use conversation samples to verify hypotheses
  • Consider external factors, not just configuration
  • Quantify the impact to prioritize response