sf-ai-agentforce-observability

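Extract and analyze Agentforce session tracing data from Salesforce Data Cloud. Supports high-volume extraction (1-10M records per day), Polars-based analysis, and debugging and optimization of agent sessions.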

SKILL.md
---
name: sf-ai-agentforce-observability
description: >
  Extract and analyze Agentforce session tracing data from Salesforce Data Cloud.
  Supports high-volume extraction (1-10M records/day), Polars-based analysis,
  and debugging workflows for agent sessions.
license: MIT
compatibility: "Requires Data Cloud enabled org with Agentforce Session Tracing"
metadata:
  version: "1.0.0"
  author: "Jag Valaiyapathy"
  data_model: "Session Tracing Data Model (STDM)"
  storage_format: "Parquet (via PyArrow)"
  analysis_library: "Polars"
hooks:
  PreToolUse:
    - matcher: Bash
      hooks:
        - type: command
          command: "python3 ${SHARED_HOOKS}/scripts/guardrails.py"
          timeout: 5000
  PostToolUse:
    - matcher: "Write|Edit"
      hooks:
        - type: command
          command: "python3 ${SKILL_HOOKS}/validate-extraction.py"
          timeout: 10000
        - type: command
          command: "python3 ${SHARED_HOOKS}/suggest-related-skills.py sf-ai-agentforce-observability"
          timeout: 5000
  SubagentStop:
    - type: command
      command: "python3 ${SHARED_HOOKS}/scripts/chain-validator.py sf-ai-agentforce-observability"
      timeout: 5000
---
<!-- TIER: 1 | ENTRY POINT -->
<!-- This is the starting document - read this FIRST -->
<!-- Pattern: Follows sf-data for Python extraction scripts -->

sf-ai-agentforce-observability: Agentforce Session Tracing Extraction & Analysis

Expert in extracting and analyzing Agentforce session tracing data from Salesforce Data Cloud. Supports high-volume data extraction (1-10M records/day), Parquet storage, and Polars-based analysis for debugging agent behavior.

Core Responsibilities

  1. Session Extraction: Extract STDM (Session Tracing Data Model) data via Data Cloud Query API
  2. Data Storage: Write to Parquet format with PyArrow for efficient storage
  3. Analysis: Polars-based lazy evaluation for memory-efficient analysis
  4. Debugging: Session timeline reconstruction for troubleshooting agent issues
  5. Cross-Skill Integration: Works with sf-connected-apps for auth, sf-ai-agentscript for fixes

Document Map

| Need | Document | Description |
| --- | --- | --- |
| Quick start | README.md | Installation & basic usage |
| Data model | resources/data-model-reference.md | Full STDM schema documentation |
| Query patterns | resources/query-patterns.md | Data Cloud SQL examples |
| Analysis recipes | resources/analysis-cookbook.md | Common Polars patterns |
| CLI reference | docs/cli-reference.md | Complete command documentation |
| Auth setup | docs/auth-setup.md | JWT Bearer configuration |
| Troubleshooting | resources/troubleshooting.md | Common issues & fixes |


CRITICAL: Prerequisites Checklist

Before extracting session data, verify:

| Check | How to Verify | Why |
| --- | --- | --- |
| Data Cloud enabled | Setup → Data Cloud | Required for Query API |
| Agentforce activated | Setup → Agentforce | Generates session data |
| Session Tracing enabled | Agent Settings | Must be ON to collect data |
| JWT Auth configured | Use sf-connected-apps | Required for Data Cloud API |

Auth Setup (via sf-connected-apps)

bash
# 1. Create key directory
mkdir -p ~/.sf/jwt

# 2. Generate certificate (naming convention: {org}-agentforce-observability)
openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 \
  -keyout ~/.sf/jwt/myorg-agentforce-observability.key \
  -out ~/.sf/jwt/myorg-agentforce-observability.crt \
  -subj "/CN=AgentforceObservability/O=MyOrg"

# 3. Secure the private key
chmod 600 ~/.sf/jwt/myorg-agentforce-observability.key

# 4. Create External Client App in Salesforce (see docs/auth-setup.md)
# Required scopes: cdp_query_api, refresh_token/offline_access

Key Path Resolution Order:

  1. Explicit --key-path argument
  2. App-specific: ~/.sf/jwt/{org}-agentforce-observability.key
  3. Generic fallback: ~/.sf/jwt/{org}.key

See docs/auth-setup.md for detailed instructions.
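
The resolution order above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function name `resolve_key_path` and the `jwt_dir` parameter are hypothetical, and it assumes that an explicit `--key-path` pointing at a missing file should fail rather than silently fall back.

```python
from pathlib import Path
from typing import Optional


def resolve_key_path(org: str, key_path: Optional[str] = None,
                     jwt_dir: Optional[Path] = None) -> Path:
    """Return the JWT private key for `org`, following the documented order."""
    # 1. Explicit --key-path argument (assumption: a missing file is an error)
    if key_path:
        explicit = Path(key_path).expanduser()
        if not explicit.is_file():
            raise FileNotFoundError(f"Key not found at explicit path: {explicit}")
        return explicit

    base = jwt_dir or Path.home() / ".sf" / "jwt"
    candidates = [
        base / f"{org}-agentforce-observability.key",  # 2. App-specific key
        base / f"{org}.key",                           # 3. Generic fallback
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"No JWT key found; tried: {tried}")
```

Calling `resolve_key_path("myorg")` would return the app-specific key when both files exist, because it is checked before the generic one.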


Session Tracing Data Model (STDM)

The STDM consists of 4 Data Model Objects (DMOs) in a hierarchical structure:

code
ssot__AIAgentSession__dlm (SESSION)
├── ssot__Id__c                          # Session ID
├── ssot__AIAgentApiName__c              # Agent API name
├── ssot__StartTimestamp__c              # Session start
├── ssot__EndTimestamp__c                # Session end
├── ssot__AIAgentSessionEndType__c       # End type (Completed, Abandoned, etc.)
├── ssot__RelatedMessagingSessionId__c   # Linked messaging session
└── ssot__OrganizationId__c              # Org ID

    └── ssot__AIAgentInteraction__dlm (TURN/SESSION_END)  [1:N]
        ├── ssot__Id__c                          # Interaction ID
        ├── ssot__aiAgentSessionId__c            # FK to Session
        ├── ssot__InteractionType__c             # TURN or SESSION_END
        ├── ssot__TopicApiName__c                # Topic that handled this turn
        ├── ssot__StartTimestamp__c              # Turn start
        └── ssot__EndTimestamp__c                # Turn end

            ├── ssot__AIAgentInteractionStep__dlm (STEP)  [1:N]
            │   ├── ssot__Id__c                          # Step ID
            │   ├── ssot__AIAgentInteractionId__c        # FK to Interaction
            │   ├── ssot__AIAgentInteractionStepType__c  # LLM_STEP or ACTION_STEP
            │   ├── ssot__Name__c                        # Action/step name
            │   ├── ssot__InputValueText__c              # Input to step
            │   ├── ssot__OutputValueText__c             # Output from step
            │   ├── ssot__PreStepVariableText__c         # Variables before
            │   ├── ssot__PostStepVariableText__c        # Variables after
            │   └── ssot__GenerationId__c                # LLM generation ID

            └── ssot__AIAgentMoment__dlm (MESSAGE)  [1:N]
                ├── ssot__Id__c                              # Message ID
                ├── ssot__AIAgentInteractionId__c            # FK to Interaction
                ├── ssot__ContentText__c                     # Message content
                ├── ssot__AIAgentInteractionMessageType__c   # INPUT or OUTPUT
                └── ssot__MessageSentTimestamp__c            # Timestamp

See resources/data-model-reference.md for full field documentation.
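
To make the foreign-key relationships concrete, the sketch below nests flat records from the four DMOs into one session tree. It uses only the field names listed above; the function name `build_session_tree` and the nested `interactions`, `steps`, and `messages` keys are illustrative choices, and timestamps are assumed to be ISO 8601 strings in a single consistent format so that they sort correctly as text.

```python
from collections import defaultdict


def build_session_tree(session, interactions, steps, messages):
    """Nest flat STDM records (dicts) under their parent via the FK fields."""
    steps_by_interaction = defaultdict(list)
    for step in steps:
        steps_by_interaction[step["ssot__AIAgentInteractionId__c"]].append(step)

    messages_by_interaction = defaultdict(list)
    for message in messages:
        messages_by_interaction[message["ssot__AIAgentInteractionId__c"]].append(message)

    turns = []
    for interaction in interactions:
        # Keep only interactions that belong to this session
        if interaction["ssot__aiAgentSessionId__c"] != session["ssot__Id__c"]:
            continue
        interaction_id = interaction["ssot__Id__c"]
        turns.append({
            **interaction,
            "steps": steps_by_interaction.get(interaction_id, []),
            "messages": messages_by_interaction.get(interaction_id, []),
        })

    # Assumption: ISO 8601 strings in one format sort chronologically
    turns.sort(key=lambda turn: turn["ssot__StartTimestamp__c"])
    return {**session, "interactions": turns}
```

Grouping children by parent ID first keeps the pass linear in the number of records, which matters once a session tree is built for many sessions at once.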


Workflow (5-Phase Pattern)

Phase 1: Requirements Gathering

Use AskUserQuestion to gather:

| # | Question | Options |
| --- | --- | --- |
| 1 | Target org | Org alias from sf org list |
| 2 | Time range | Last N days / Date range |
| 3 | Agent filter | All agents / Specific API names |
| 4 | Output format | Parquet (default) / CSV |
| 5 | Analysis type | Summary / Debug session / Full extraction |

Phase 2: Auth Configuration

Verify JWT auth is configured:

python
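import os

from scripts.auth import DataCloudAuth

# Read the ECA consumer key from the environment instead of hard-coding it
auth = DataCloudAuth(
    org_alias="myorg",
    consumer_key=os.environ["SF_CONSUMER_KEY"],
)

# Test authentication; report success without printing the token itself
token = auth.get_token()
print(f"Auth successful: received token of length {len(token)}")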

If auth fails, invoke:

code
Skill(skill="sf-connected-apps", args="Setup JWT Bearer for Data Cloud")
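
For orientation, the JWT Bearer flow signs a short-lived assertion whose claims identify the app and the user. The sketch below builds only that claim set with the standard library. It is an illustration and not the code inside `scripts/auth.py`: the function name `build_jwt_claims` and its parameters are hypothetical, and the three-minute lifetime and login URL are defaults to adjust for the target org.

```python
import time
from typing import Optional


def build_jwt_claims(consumer_key: str, username: str,
                     login_url: str = "https://login.salesforce.com",
                     lifetime_seconds: int = 180,
                     now: Optional[float] = None) -> dict:
    """Build the claim set for a Salesforce JWT Bearer assertion."""
    issued_at = int(now if now is not None else time.time())
    return {
        "iss": consumer_key,                    # Consumer key of the External Client App
        "sub": username,                        # Salesforce username to act as
        "aud": login_url,                       # Login URL (use the sandbox URL for sandboxes)
        "exp": issued_at + lifetime_seconds,    # Keep the assertion short-lived
    }


# The claims would then be signed with the private key using RS256, for example
# with PyJWT: jwt.encode(claims, private_key_pem, algorithm="RS256")
```

The signed assertion is posted to the org's OAuth token endpoint with the `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type; see docs/auth-setup.md for the steps this skill actually follows.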

Phase 3: Extraction

Basic Extraction (last 7 days):

bash
python3 scripts/cli.py extract \
  --org prod \
  --days 7 \
  --output ./stdm_data

Filtered Extraction:

bash
python3 scripts/cli.py extract \
  --org prod \
  --since 2026-01-01 \
  --until 2026-01-28 \
  --agent Customer_Support_Agent \
  --output ./stdm_data

Session Tree (specific session):

bash
python3 scripts/cli.py extract-tree \
  --org prod \
  --session-id "a0x..." \
  --output ./debug_session
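
As a rough picture of what a filtered extraction asks Data Cloud for, the sketch below assembles a SQL string for the session DMO from a date range and an optional agent name. It is an assumption-laden illustration: the function name `build_session_query` is hypothetical, `--until` is treated as inclusive, and the date literal format may need adjusting to what the Query API accepts (see resources/query-patterns.md for the patterns the skill really uses).

```python
from datetime import date, timedelta
from typing import Optional

SESSION_DMO = "ssot__AIAgentSession__dlm"
SESSION_FIELDS = [
    "ssot__Id__c",
    "ssot__AIAgentApiName__c",
    "ssot__StartTimestamp__c",
    "ssot__EndTimestamp__c",
    "ssot__AIAgentSessionEndType__c",
]


def build_session_query(since: date, until: date, agent: Optional[str] = None) -> str:
    """Build a session query for [since, until], optionally for one agent."""
    # Assumption: --until is inclusive, so the upper bound is the next day
    upper = until + timedelta(days=1)
    clauses = [
        f"ssot__StartTimestamp__c >= '{since.isoformat()}'",
        f"ssot__StartTimestamp__c < '{upper.isoformat()}'",
    ]
    if agent:
        escaped = agent.replace("'", "''")  # Escape single quotes in the literal
        clauses.append(f"ssot__AIAgentApiName__c = '{escaped}'")
    fields = ", ".join(SESSION_FIELDS)
    return f"SELECT {fields} FROM {SESSION_DMO} WHERE " + " AND ".join(clauses)
```

Using a half-open range on the start timestamp avoids double-counting sessions when consecutive date ranges are extracted back to back.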

Phase 4: Analysis

Session Summary:

python
from scripts.analyzer import STDMAnalyzer
from pathlib import Path

analyzer = STDMAnalyzer(Path("./stdm_data"))

# High-level summary
summary = analyzer.session_summary()
print(summary)

# Step distribution by agent
steps = analyzer.step_distribution(agent_name="Customer_Support_Agent")
print(steps)

# Topic routing analysis
topics = analyzer.topic_analysis()
print(topics)

Debug Specific Session:

bash
python3 scripts/cli.py debug-session \
  --data-dir ./stdm_data \
  --session-id "a0x..."

Phase 5: Integration & Next Steps

Based on analysis findings:

| Finding | Next Step | Skill |
| --- | --- | --- |
| Topic mismatch | Improve topic descriptions | sf-ai-agentscript |
| Action failures | Debug Flow/Apex | sf-flow, sf-debug |
| Slow responses | Optimize actions | sf-apex |
| Missing coverage | Add test cases | sf-ai-agentforce-testing |

CLI Quick Reference

Extraction Commands

| Command | Purpose | Example |
| --- | --- | --- |
| extract | Extract session data | extract --org prod --days 7 |
| extract-tree | Extract full session tree | extract-tree --org prod --session-id "a0x..." |
| extract-incremental | Resume from last run | extract-incremental --org prod |

Analysis Commands

| Command | Purpose | Example |
| --- | --- | --- |
| analyze | Generate summary stats | analyze --data-dir ./stdm_data |
| debug-session | Timeline view | debug-session --session-id "a0x..." |
| topics | Topic analysis | topics --data-dir ./stdm_data |

Common Flags

| Flag | Description | Default |
| --- | --- | --- |
| --org | Target org alias | Required |
| --consumer-key | ECA consumer key | $SF_CONSUMER_KEY env var |
| --key-path | JWT private key path | ~/.sf/jwt/{org}-agentforce-observability.key |
| --days | Last N days | 7 |
| --since | Start date (YYYY-MM-DD) | - |
| --until | End date (YYYY-MM-DD) | Today |
| --agent | Filter by agent API name | All |
| --output | Output directory | ./stdm_data |
| --verbose | Detailed logging | False |
| --format | Output format (table/json/csv) | table |

See docs/cli-reference.md for complete documentation.
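
The interaction between `--days`, `--since`, and `--until` can be sketched as follows. This is a guess at reasonable semantics rather than the CLI's documented behavior: the function name `resolve_window` is hypothetical, and it assumes that `--since` takes precedence over `--days` when both are given.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_window(days: int = 7, since: Optional[str] = None,
                   until: Optional[str] = None,
                   today: Optional[date] = None) -> Tuple[date, date]:
    """Turn --days / --since / --until into a concrete (start, end) date pair."""
    today = today or date.today()
    # --until defaults to today
    end = date.fromisoformat(until) if until else today
    # Assumption: an explicit --since overrides --days
    start = date.fromisoformat(since) if since else end - timedelta(days=days)
    if start > end:
        raise ValueError(f"--since {start} is after --until {end}")
    return start, end
```

With the defaults and a run date of 2026-01-28, this yields 2026-01-21 to 2026-01-28.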


Analysis Examples

Session Summary

code
📊 SESSION SUMMARY
════════════════════════════════════════════════════════════════

Period: 2026-01-21 to 2026-01-28
Total Sessions: 15,234
Unique Agents: 3

SESSIONS BY AGENT
────────────────────────────────────────────────────────────────
Agent                          │ Sessions │ Avg Turns │ Avg Duration
───────────────────────────────┼──────────┼───────────┼─────────────
Customer_Support_Agent         │   8,502  │    4.2    │     3m 15s
Order_Tracking_Agent           │   4,128  │    2.8    │     1m 45s
Product_FAQ_Agent              │   2,604  │    1.9    │       45s

END TYPE DISTRIBUTION
────────────────────────────────────────────────────────────────
✅ Completed:    12,890 (84.6%)
🔄 Escalated:     1,523 (10.0%)
❌ Abandoned:       821 (5.4%)

Debug Session Timeline

code
🔍 SESSION DEBUG: a0x1234567890ABC
════════════════════════════════════════════════════════════════

Agent: Customer_Support_Agent
Started: 2026-01-28 10:15:23 UTC
Duration: 4m 32s
End Type: Completed
Turns: 5

TIMELINE
────────────────────────────────────────────────────────────────
10:15:23 │ [INPUT]  "I need help with my order #12345"
10:15:24 │ [TOPIC]  → Order_Tracking (confidence: 0.95)
10:15:24 │ [STEP]   LLM_STEP: Identify intent
10:15:25 │ [STEP]   ACTION_STEP: Get_Order_Status
         │          Input: {"orderId": "12345"}
         │          Output: {"status": "Shipped", "eta": "2026-01-30"}
10:15:26 │ [OUTPUT] "Your order #12345 has shipped and will arrive by Jan 30."

10:16:01 │ [INPUT]  "Can I change the delivery address?"
10:16:02 │ [TOPIC]  → Order_Tracking (same topic)
10:16:02 │ [STEP]   LLM_STEP: Clarify request
10:16:03 │ [STEP]   ACTION_STEP: Check_Modification_Eligibility
         │          Input: {"orderId": "12345", "type": "address_change"}
         │          Output: {"eligible": false, "reason": "Already shipped"}
10:16:04 │ [OUTPUT] "I'm sorry, the order has already shipped..."
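
A timeline like the one above comes down to merging events from several tables and sorting them by time. The sketch below shows only that rendering step. It assumes events have already been normalized into dicts with hypothetical `ts`, `kind`, and `text` keys, where `ts` is an ISO 8601 string that `datetime.fromisoformat` can parse; how the skill derives a timestamp for each step is not shown in the field list and is left out here.

```python
from datetime import datetime


def render_timeline(events):
    """Sort normalized events by time and format them as timeline lines."""
    parsed = [(datetime.fromisoformat(event["ts"]), event) for event in events]
    # sorted() is stable, so events with equal timestamps keep their input order
    parsed.sort(key=lambda pair: pair[0])

    lines = []
    for moment, event in parsed:
        tag = f"[{event['kind']}]"
        # Pad the tag so the text column lines up across event kinds
        lines.append(f"{moment:%H:%M:%S} │ {tag:<9}{event['text']}")
    return lines
```

Feeding it messages mapped to `INPUT`/`OUTPUT` events and steps mapped to `STEP` events would produce lines in the same layout as the sample output.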

Cross-Skill Integration

Prerequisite Skills

| Skill | When | How to Invoke |
| --- | --- | --- |
| sf-connected-apps | Auth setup | Skill(skill="sf-connected-apps", args="JWT Bearer for Data Cloud") |

Follow-up Skills

| Finding | Skill | How to Invoke |
| --- | --- | --- |
| Topic routing issues | sf-ai-agentscript | Skill(skill="sf-ai-agentscript", args="Fix topic: [issue]") |
| Action failures | sf-flow / sf-debug | Skill(skill="sf-debug", args="Analyze agent action failure") |
| Test coverage gaps | sf-ai-agentforce-testing | Skill(skill="sf-ai-agentforce-testing", args="Add test cases") |

Commonly Used With

| Skill | Use Case | Confidence |
| --- | --- | --- |
| sf-ai-agentscript | Fix agent based on trace analysis | ⭐⭐⭐ Required |
| sf-ai-agentforce-testing | Create test cases from observed patterns | ⭐⭐ Recommended |
| sf-debug | Deep-dive into action failures | ⭐⭐ Recommended |

Key Insights

| Insight | Description | Action |
| --- | --- | --- |
| STDM is read-only | Data Cloud stores traces; cannot modify | Use for analysis only |
| Session lag | Data may lag 5-15 minutes | Don't expect real-time |
| Volume limits | Query API: 10M records/day | Use incremental extraction |
| Parquet efficiency | 10x smaller than JSON | Always use Parquet for storage |
| Lazy evaluation | Polars scans without loading | Handles 100M+ rows |

Common Issues & Fixes

| Error | Cause | Fix |
| --- | --- | --- |
| 401 Unauthorized | JWT auth expired/invalid | Refresh token or reconfigure ECA |
| No session data | Tracing not enabled | Enable Session Tracing in Agent Settings |
| Query timeout | Too much data | Add date filters, use incremental |
| Memory error | Loading all data | Use Polars lazy frames |
| Missing DMO | Wrong API version | Use API v60.0+ |

See resources/troubleshooting.md for detailed solutions.


Output Directory Structure

After extraction:

code
stdm_data/
├── sessions/
│   └── date=2026-01-28/
│       └── part-0000.parquet
├── interactions/
│   └── date=2026-01-28/
│       └── part-0000.parquet
├── steps/
│   └── date=2026-01-28/
│       └── part-0000.parquet
├── messages/
│   └── date=2026-01-28/
│       └── part-0000.parquet
└── metadata/
    ├── extraction.json      # Extraction parameters
    └── watermark.json       # For incremental extraction
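
Incremental extraction depends on remembering where the previous run stopped. The sketch below shows one way a watermark file could be read and written. The JSON layout (a single `last_extracted_at` ISO timestamp) and the function names `load_watermark` and `save_watermark` are assumptions for illustration; the real `watermark.json` schema is not documented here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def load_watermark(path: Path) -> Optional[datetime]:
    """Return the last extracted timestamp, or None on a first run."""
    if not path.is_file():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    return datetime.fromisoformat(data["last_extracted_at"])


def save_watermark(path: Path, latest: datetime) -> None:
    """Persist the newest timestamp seen, replacing the old file in one step."""
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"last_extracted_at": latest.astimezone(timezone.utc).isoformat()}
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload), encoding="utf-8")
    tmp.replace(path)  # Rename over the old file so readers never see a partial write
```

A run would call `load_watermark` to pick its start time, extract everything newer, and call `save_watermark` only after the Parquet files are written, so a failed run is retried from the same point.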

Dependencies

Python 3.10+ with:

code
polars>=1.0.0           # DataFrame library (lazy evaluation)
pyarrow>=15.0.0         # Parquet support
pyjwt>=2.8.0            # JWT generation
cryptography>=42.0.0    # Certificate handling
httpx>=0.27.0           # HTTP client
rich>=13.0.0            # CLI progress bars
click>=8.1.0            # CLI framework
pydantic>=2.6.0         # Data validation

Install: pip install -r requirements.txt


License

MIT License. See LICENSE file. Copyright (c) 2024-2026 Jag Valaiyapathy