Incident Investigator
Investigate Android authentication incidents systematically with evidence-first diagnosis.
Investigation Workflow
Execute these steps IN ORDER. Do not skip steps.
Step 1: Gather IcM Context
Query DRI Copilot MCP FIRST:
mcp_dricopilotdem_Broker_DRI_Copilot_Project_Explorer
Extract from IcM:
- Affected app(s): Outlook, Teams, other 1P apps?
- Account(s): Specific user or tenant-wide?
- Device context: SDM enabled? Device model? Android version?
- Symptoms: What exactly fails? Error messages?
- Repro conditions: When does it happen vs. not happen?
Step 2: Extract Log Evidence
Search logs for these key patterns:
| Pattern | What It Tells You |
|---|---|
| `correlation_id:` | Request tracking ID for eSTS correlation |
| `error_code` or `Error` | Specific failure reason |
| `No PRT present` | Missing Primary Refresh Token |
| `SignOut` or `removeAccount` | Account removal events |
| `disabled by MDM` | MDM policy interference |
| `invoked for package name:` | Which app made the request |
| `executed successfully` vs `failed` | Operation outcome |
Build a timeline of events with correlation IDs.
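The scan-and-group step above can be sketched in Python. This is a minimal illustration, not a description of any existing tool: the function name `build_timeline`, the regular expression for the ID, and the sample log lines are all assumptions, since the exact broker log line format is not specified here.

```python
import re
from collections import defaultdict

# Substrings taken from the pattern table above.
KEY_PATTERNS = [
    "error_code",
    "Error",
    "No PRT present",
    "SignOut",
    "removeAccount",
    "disabled by MDM",
    "invoked for package name:",
    "executed successfully",
    "failed",
]

# Assumption: the ID follows "correlation_id:" and consists of hex digits and dashes.
CORRELATION_RE = re.compile(r"correlation_id:\s*([0-9a-fA-F-]+)")


def build_timeline(lines):
    """Group log lines that match a key pattern by correlation ID, in log order."""
    timeline = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        hits = [p for p in KEY_PATTERNS if p in line]
        if not hits:
            continue  # line carries none of the key patterns
        match = CORRELATION_RE.search(line)
        corr_id = match.group(1) if match else "(no correlation id)"
        timeline[corr_id].append((line_no, hits, line.strip()))
    return dict(timeline)


# Hypothetical log lines, invented for illustration only.
sample = [
    "10:00:01 correlation_id: abc-123 invoked for package name: com.example.app",
    "10:00:02 correlation_id: abc-123 No PRT present for the account",
    "10:00:03 unrelated line",
    "10:00:04 removeAccount called",
]

for corr_id, events in build_timeline(sample).items():
    print(corr_id)
    for line_no, hits, text in events:
        print(f"  line {line_no}: {hits}")
```

Lines that match a pattern but carry no correlation ID are kept under a separate key rather than dropped, so they still appear in the timeline and can be flagged as an evidence gap.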
Step 3: Analyze Account/Token State
Check these indicators in logs:
| Log Message | Indicates |
|---|---|
| `Found [N] Accounts...` | How many accounts in cache |
| `No PRT present for the account` | PRT missing or wiped |
| `Home Account id doesn't have uid or tenant id` | Incomplete account state |
| `Found more than one account entry` | Duplicate account issue |
| `PRT is already registered-device PRT` | Valid WPJ PRT exists |
| `Loading Workplace Join entry for tenant:` | Device is WPJ'd |
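As a rough sketch, the indicator table can be turned into a lookup that summarizes account and token state from a set of log lines. The function name `summarize_account_state` is hypothetical, and the regular expression assumes `[N]` is rendered as a bare or bracketed integer (for example `Found 2 Accounts`), which may not match the real log format.

```python
import re

# Indicator substrings and their meanings, copied from the table above.
INDICATORS = {
    "No PRT present for the account": "PRT missing or wiped",
    "Home Account id doesn't have uid or tenant id": "Incomplete account state",
    "Found more than one account entry": "Duplicate account issue",
    "PRT is already registered-device PRT": "Valid WPJ PRT exists",
    "Loading Workplace Join entry for tenant:": "Device is WPJ'd",
}

# Assumption: N appears as a bare or bracketed integer after "Found".
ACCOUNT_COUNT_RE = re.compile(r"Found \[?(\d+)\]? Accounts")


def summarize_account_state(lines):
    """Collect account counts and the distinct indicator meanings seen in the logs."""
    counts = []
    findings = []
    for line in lines:
        m = ACCOUNT_COUNT_RE.search(line)
        if m:
            counts.append(int(m.group(1)))
        for needle, meaning in INDICATORS.items():
            if needle in line and meaning not in findings:
                findings.append(meaning)
    return {"account_counts": counts, "findings": findings}
```

A summary that shows both "Valid WPJ PRT exists" and "PRT missing or wiped" points at a state change inside the log window, which is worth placing on the timeline.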
Step 4: Identify Operation Flow
Map the operations that occurred:
| Operation | Purpose |
|---|---|
| `GetDeviceModeMsalBrokerOperation` | Check if SDM enabled |
| `GetCurrentAccountMsalBrokerOperation` | Fetch signed-in account |
| `AcquireTokenSilentMsalBrokerOperation` | Silent token acquisition |
| `AcquireTokenInteractiveMsalBrokerOperation` | Interactive auth |
| `SignOutFromSharedDeviceMsalBrokerOperation` | SDM sign-out (⚠️ key for SDM issues) |
| `GetPreferredAuthMethodMsalBrokerOperation` | Auth method check |
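One way to map the flow is to pull every operation name out of the logs in order. The sketch below relies only on the shared `MsalBrokerOperation` suffix visible in the table. The helper names `operation_flow` and `has_sdm_sign_out` are hypothetical, and it assumes operation names appear verbatim in the log text.

```python
import re

# Every operation in the table above ends with this suffix.
OPERATION_RE = re.compile(r"\b(\w+MsalBrokerOperation)\b")
SDM_SIGN_OUT = "SignOutFromSharedDeviceMsalBrokerOperation"


def operation_flow(lines):
    """Return (line_number, operation_name) pairs in the order they appear."""
    flow = []
    for line_no, line in enumerate(lines, start=1):
        for name in OPERATION_RE.findall(line):
            flow.append((line_no, name))
    return flow


def has_sdm_sign_out(flow):
    """True if the SDM sign-out operation appears anywhere in the flow."""
    return any(name == SDM_SIGN_OUT for _, name in flow)
```

Because the match is on the suffix rather than a fixed list, operations not listed in the table are captured too, which helps when a log contains something unexpected.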
Step 5: Form Hypotheses
Rank by evidence strength:
| Confidence | Criteria |
|---|---|
| HIGH | Direct log evidence shows the issue |
| MEDIUM | Logs suggest but don't confirm |
| LOW | Inference based on patterns, no direct evidence |
Common root causes to consider:
- MDM triggering sign-out (Imprivata, other MDMs)
- PRT deleted/expired/revoked
- Device cap reached
- Account-specific CA policy
- SDM misconfiguration
- Broker/app version incompatibility
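The ranking in this step can be sketched as a small data structure sorted by confidence level. The `Hypothesis` class and `rank` function are illustrative names, and breaking ties by the number of supporting log lines is an assumption of this sketch rather than a rule from the workflow.

```python
from dataclasses import dataclass, field

# Lower number sorts first: HIGH before MEDIUM before LOW.
CONFIDENCE_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class Hypothesis:
    description: str
    confidence: str  # one of the keys in CONFIDENCE_ORDER
    evidence: list = field(default_factory=list)  # exact log lines


def rank(hypotheses):
    """Sort by confidence, then by amount of supporting evidence (assumed tie-break)."""
    return sorted(
        hypotheses,
        key=lambda h: (CONFIDENCE_ORDER[h.confidence], -len(h.evidence)),
    )


candidates = [
    Hypothesis("Broker/app version incompatibility", "LOW"),
    Hypothesis("MDM triggering sign-out", "HIGH", ["disabled by MDM"]),
    Hypothesis("PRT deleted/expired/revoked", "MEDIUM", ["No PRT present"]),
]
for position, h in enumerate(rank(candidates), start=1):
    print(position, h.confidence, h.description)
```

Storing the exact log lines on each hypothesis keeps the "Supporting Evidence" column of the output table traceable back to the logs.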
Step 6: Identify Missing Evidence
State explicitly what's NOT in the logs that would help:
- Missing correlation IDs?
- No sign-out operation captured?
- No eSTS error codes?
- Logs from wrong time window?
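Part of this checklist can be automated by reporting which kinds of evidence never appear in the collected logs. The following is a sketch under assumptions: the mapping from each question to log substrings is a guess based on the Step 2 patterns (in particular, treating `error_code` as the marker for eSTS error codes), and the time-window check is left out because the log timestamp format is not specified here.

```python
# Each question from the list above is paired with substrings whose presence
# would answer it. The pairing is an assumption made for this sketch.
EVIDENCE_CHECKS = {
    "Missing correlation IDs?": ["correlation_id:"],
    "No sign-out operation captured?": ["SignOut", "removeAccount"],
    "No eSTS error codes?": ["error_code"],
}


def missing_evidence(lines):
    """Return the checklist questions for which no log line provides evidence."""
    gaps = []
    for question, needles in EVIDENCE_CHECKS.items():
        found = any(needle in line for line in lines for needle in needles)
        if not found:
            gaps.append(question)
    return gaps
```

The returned questions can be copied directly into the "Missing Evidence" section of the report.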
Output Format
## Investigation: IcM [Number]

### IcM Summary

| Field | Value |
|-------|-------|
| Affected App(s) | |
| Account | |
| Device | Android [version], Broker [version] |
| SDM Enabled | Yes/No |
| Symptoms | |

### Key Correlation IDs

| Correlation ID | Operation | Result |
|----------------|-----------|--------|
| `abc-123...` | AcquireTokenSilent | ✅/❌ |

### Evidence from Logs

#### Finding 1: [Description]

- **Timestamp**:
- **Evidence**: [Exact log line]
- **Implication**:

### Hypotheses (Ranked by Evidence)

| # | Hypothesis | Confidence | Supporting Evidence |
|---|------------|------------|---------------------|
| 1 | | HIGH/MED/LOW | |

### Missing Evidence

- [ ] [What additional data is needed]

### Recommended Actions

1. [Next step]
2. [Next step]
Common Patterns
Pattern: MDM-Triggered Sign-Out (SDM)
Symptoms: User signs in, immediately signed out

Evidence to look for:
- `SignOutFromSharedDeviceMsalBrokerOperation` from MDM package
- `disabled by MDM` messages
- `No PRT present` after successful auth
Pattern: Missing PRT
Symptoms: Silent auth fails, interactive works

Evidence to look for:
- `No PRT present for the account`
- Check if `AcquireTokenSilent` fails but `AcquireTokenInteractive` succeeds
- Look for prior sign-out or PRT revocation
Pattern: Device Cap
Symptoms: New device can't register

Evidence to look for:
- Error during device registration
- eSTS error about device limit
- Check eSTS logs with correlation ID
Pattern: Duplicate Accounts
Symptoms: Inconsistent auth behavior

Evidence to look for:
- `Found more than one account entry for user`
- Multiple accounts with same UPN but different home account IDs
DRI Copilot Queries
Initial Query (always start here)
When given just an incident ID, query DRI Copilot with:
"Investigate IcM [number]. What are the affected apps, symptoms, and known issues?"
This single query extracts:
- Affected application(s)
- Customer-reported symptoms
- Account/device context
- Any known root cause or past similar incidents
Follow-up Queries (after initial context)
Once you have context from the initial query, use targeted follow-ups:
"TSG for error code [error_code]"          # After finding error in logs
"Past incidents related to [symptom]"      # After identifying symptom from IcM
"How to troubleshoot [specific_issue]"     # For deep-dive guidance
eSTS Correlation
Use the Kusto MCP tool to correlate with eSTS when needed:
mcp_my-mcp-server_execute_query
Parameters:
- cluster: `https://estswus2.kusto.windows.net`
- database: `ESTS`
- query: (see below)
Basic correlation query:
AllPerRequestTable
| where env_time >= ago(7d)
| where DevicePlatformForUI == "Android"
| where CorrelationId == "[correlation-id]"
| project env_time, CorrelationId, Call, Result, ErrorCode, PrtData
For more Kusto queries, see references/kusto-queries.md.
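When the correlation ID comes out of a log file, it is worth validating it before substituting it into the query text. The sketch below fills in the basic correlation query from a template. It assumes correlation IDs are GUIDs, and the helper name `build_correlation_query` and the `days` parameter are hypothetical. The resulting string would be passed as the `query` parameter of the Kusto MCP tool.

```python
import uuid

# Same query as the basic correlation query above, with two placeholders.
QUERY_TEMPLATE = """AllPerRequestTable
| where env_time >= ago({days}d)
| where DevicePlatformForUI == "Android"
| where CorrelationId == "{correlation_id}"
| project env_time, CorrelationId, Call, Result, ErrorCode, PrtData"""


def build_correlation_query(correlation_id, days=7):
    """Return the query text for one correlation ID.

    Assumption: correlation IDs are GUIDs. uuid.UUID raises ValueError on
    malformed input, which also keeps stray quotes out of the query text.
    """
    parsed = uuid.UUID(correlation_id)
    # str(parsed) is the normalized lowercase, hyphenated form of the GUID.
    return QUERY_TEMPLATE.format(days=int(days), correlation_id=str(parsed))


print(build_correlation_query("12345678-1234-5678-1234-567812345678"))
```

Widening `days` is useful when the incident was reported well after the failure, at the cost of a slower query.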
Key Reminders
- Query DRI Copilot FIRST - Get IcM context before analyzing logs
- Evidence over assumptions - Only state what logs show
- State what's missing - Be explicit about evidence gaps
- Search all log files - Issue may span multiple log segments
- Check for sign-out operations - Critical for SDM issues