# Observability Analysis
You are the platform engineer for {{PROJECT_NAME}}. Generate a structured observability report by querying error tracking and analytics APIs, then cross-reference findings with the codebase to identify instrumentation gaps.
## Prerequisites
Validate required environment variables before proceeding. If any are missing, print the setup instructions and stop.
Configure your error tracking tool (e.g., Sentry) and analytics tool (e.g., PostHog) environment variables as needed for your project.
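The validation step might be sketched like this; the variable names `SENTRY_AUTH_TOKEN` and `POSTHOG_API_KEY` are placeholders, not requirements of any specific tool:

```typescript
// Sketch: check that every required environment variable is set.
// Variable names below are placeholders; use your project's own.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>
): string[] {
  return required.filter((name) => !env[name]);
}

// In practice you would pass process.env here.
const missing = missingEnvVars(
  ["SENTRY_AUTH_TOKEN", "POSTHOG_API_KEY"],
  { SENTRY_AUTH_TOKEN: "example" }
);
if (missing.length > 0) {
  console.log("Missing: " + missing.join(", ") + ". Set them and re-run.");
}
```

If anything is missing, print the setup instructions and stop rather than proceeding with partial data.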
## Period Argument
Parse the optional argument `$ARGUMENTS` (default: `week`):

| Argument | Query Period | Label |
|---|---|---|
| `week` (default) | 7 days | Last 7 days |
| `month` | 30 days | Last 30 days |
| `sprint` | 7 days | Current sprint (same as `week`, labeled differently) |
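A minimal sketch of the argument parsing described above:

```typescript
// Sketch: map the optional period argument to a query window and label.
type Period = { days: number; label: string };

function parsePeriod(arg?: string): Period {
  switch ((arg ?? "week").toLowerCase()) {
    case "month":
      return { days: 30, label: "Last 30 days" };
    case "sprint":
      // Same window as "week", but labeled for sprint reports.
      return { days: 7, label: "Current sprint" };
    case "week":
    default:
      return { days: 7, label: "Last 7 days" };
  }
}
```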
## Step 0: Codebase Instrumentation Audit
Before querying external APIs, scan the codebase to build a current inventory of instrumentation. This ensures queries match reality.
### Span/Trace Inventory
ALWAYS grep first. Do not rely on any static list — the codebase is the source of truth.
Search for performance measurement calls, custom spans, and trace instrumentation in your codebase. Build a table from the grep results:
| Span Name | File | Purpose |
|---|---|---|
Also check for:
- Error capture calls (e.g., `captureException`)
- Custom tags/context
- Breadcrumbs (navigation/state changes)
- Performance measurements
### Analytics Events Inventory
ALWAYS grep first. Do not rely on any static list — the codebase is the source of truth.
Search for analytics event definitions and their usage in non-test code. Mark each event as:
- **Active** = capture call found in non-test code
- **Unwired** = defined but no capture call found in non-test code
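The Active/Unwired classification reduces to a set comparison; a sketch, assuming the grep results have already been collected into name lists:

```typescript
// Sketch: classify defined analytics events as Active or Unwired,
// given the set of event names found in non-test capture calls.
function classifyEvents(
  defined: string[],
  capturedInNonTestCode: Set<string>
): { active: string[]; unwired: string[] } {
  return {
    active: defined.filter((e) => capturedInNonTestCode.has(e)),
    unwired: defined.filter((e) => !capturedInNonTestCode.has(e)),
  };
}
```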
### Gap Analysis
Flag any of these as observability gaps:
- Unwired events — defined but never called. These represent intended instrumentation that was never connected.
- Missing error capture — code paths with `try/catch` that swallow errors without reporting.
- Missing spans — significant async operations (database queries, network calls, file I/O) without performance tracking.
- Missing breadcrumbs — navigation transitions or state changes without breadcrumb logging.
## Step 1: Query Error Tracking
Query your error tracking tool (Sentry, Bugsnag, etc.) for the configured period.
Collect:
- Crash-free rate — session health over the period
- Top unresolved issues — top 10 by event count
- Error trends — new vs. resolved issues over the period
- Performance p50/p95 — for all instrumented spans found in Step 0
- Release health — recent releases, crash-free users, new issues per release
- Screen/page performance — Time to Initial Display or equivalent per screen/route
- App start performance — cold start and warm start times
- Frame rates / UI performance — slow and frozen frame rates per screen (if applicable)
- All transactions overview — query ALL transactions sorted by volume
If any query returns a rate limit error, wait 10 seconds and retry once. If it fails again, note "Rate limited" in that section.
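One way to sketch the retry rule; the assumption that the client surfaces rate limits as an error carrying `status === 429` is illustrative, since real SDKs differ:

```typescript
// Sketch: run a query, retrying once after a rate-limit error.
// Assumes rate limits surface as an error with a `status` of 429.
async function queryWithRetry<T>(
  query: () => Promise<T>,
  waitMs = 10_000
): Promise<T | "Rate limited"> {
  try {
    return await query();
  } catch (err: any) {
    if (err?.status !== 429) throw err; // non-rate-limit errors propagate
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    try {
      return await query();
    } catch {
      return "Rate limited"; // note this in the affected report section
    }
  }
}
```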
## Step 2: Query Analytics
Query your analytics tool (PostHog, Amplitude, Mixpanel, etc.) for the configured period.
Collect:
- Retention — D1, D7, D30 cohorts
- Engagement — weekly active usage metrics relevant to your product
- Funnel completion — key user journey completion rates
- Onboarding funnel — first-time user conversion
- Feature adoption — event counts for key features
- DAU/WAU — daily and weekly active users trend
- Error events — application-level error event counts and trends
If any query times out, fall back to simpler event count queries and note the limitation.
## Step 3: Correlate and Analyze
Cross-reference error tracking and analytics data:
- Crash impact: Do top errors correlate with drop-offs in analytics funnels?
- Performance vs. engagement: Do performance regressions (p95 increases) correlate with lower retention?
- Feature adoption vs. errors: Are newly adopted features generating disproportionate errors?
- Instrumentation coverage: Compare the codebase inventory (Step 0) against what was actually received. If a span is instrumented but has zero events, something is broken.
If data is insufficient for correlation (pre-beta, low traffic), explicitly state: "Insufficient production data for cross-platform correlation. Analysis based on available development/testing traffic."
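The coverage check in particular is mechanical; a sketch, assuming the Step 0 inventory and per-span event counts from the queries above are available:

```typescript
// Sketch: flag spans that are instrumented in the codebase but
// reported zero events over the period (likely broken wiring).
function findSilentSpans(
  inventory: string[],
  eventCounts: Map<string, number>
): string[] {
  return inventory.filter((span) => (eventCounts.get(span) ?? 0) === 0);
}
```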
## Step 4: Generate Report
Save the report to `docs/reports/observability-report-YYYY-MM-DD.md`, where `YYYY-MM-DD` is today's date.
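Formatting today's date for the filename is easy to get wrong (JavaScript months are zero-based); a sketch:

```typescript
// Sketch: build the report path from a date in YYYY-MM-DD form.
function reportPath(date: Date): string {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(date.getDate()).padStart(2, "0");
  return `docs/reports/observability-report-${yyyy}-${mm}-${dd}.md`;
}
```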
## Step 5: Action Items
For each identified issue, create an action item with:
- Title: Short description (suitable for a GitHub issue title)
- Severity: P0 (critical), P1 (high), P2 (medium)
- Evidence: Data points from error tracking/analytics supporting the finding
- Recommendation: Specific next step with file paths
- Backlog command: Ready-to-use `/bug` or `/feature` command
Severity guide:
- P0: Crash-free rate <99%, data loss, security-related errors
- P1: p95 >2x target, retention below target, error count trending up >50%
- P2: Missing instrumentation, unwired events, minor performance degradation
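The severity guide could be encoded as a first-pass classifier; the thresholds mirror the guide above, and the field names are illustrative:

```typescript
// Sketch: map observed metrics to a severity bucket per the guide.
// Field names are illustrative, not a required schema.
type Metrics = {
  crashFreeRate?: number;      // percent (e.g., 99.4)
  p95OverTargetRatio?: number; // observed p95 divided by target
  errorTrendPct?: number;      // percent change over the period
};

function severity(m: Metrics): "P0" | "P1" | "P2" {
  if (m.crashFreeRate !== undefined && m.crashFreeRate < 99) return "P0";
  if ((m.p95OverTargetRatio ?? 0) > 2 || (m.errorTrendPct ?? 0) > 50) return "P1";
  return "P2";
}
```

Data loss and security findings are P0 regardless of metrics, so a human pass over the classifier's output is still needed.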
Example:
### P1: Data processing p95 above target — 3.8s vs <2.5s target
**Evidence:** Error tracking span `data.process` p95 = 3800ms (target: <2500ms), 234 samples over 7 days
**File:** `{{APP_DIR}}services/pipeline/DataPipeline.ts:62`
**Recommendation:** Profile data processing — check if large inputs cause tail latency. Consider adding input size as a tag for segmentation.
**Backlog:** `/bug Data processing p95 is 3.8s vs 2.5s target — profile for tail latency.`
## Step 6: Present Summary
Print a text summary to the user:
```
Observability Report — [period label]
======================================

App Health:
  Crash-free rate: XX.X%
  Sessions: N total, N crashed, N errored
  Top issue: [issue title] (N events)

Performance — Custom Spans:
  [span name] p95: X.Xs (target: <Xs)
  ...

Performance — Screen/Page Load:
  [screen] avg: Xs, p95: Xs (target: <2s)
  ...

Performance — App Start:
  Cold start avg: Xms, p95: Xms (target: <2s)
  Warm start avg: Xms, p95: Xms (target: <1s)

Engagement:
  DAU: N | WAU: N
  D7 retention: XX% (target: >40%)
  Key engagement metric: XX (target: >XX)

Instrumentation Gaps:
  N unwired events, N missing spans

Error Events:
  N total application-level errors

Action Items: N total (P0: N, P1: N, P2: N)
  [list top 3 action items]

Full report: docs/reports/observability-report-YYYY-MM-DD.md
```
## Integration Points
### With `/sprint-close`
When called from sprint-close, include a condensed "Observability Highlights" section:
- Crash-free rate + trend
- Top performance metric change
- Key engagement metric change
- Critical action items only
### With `/status`
When called from status, provide these metrics for the dashboard:
- `crashFreeRate`: percentage
- `p95ColdStart`: seconds
- `dau`: number
- `d7Retention`: percentage