Skill: Metrics Reading and Diagnosis
Purpose
Read available team and delivery metrics, identify meaningful patterns, and produce a system-level diagnosis that can guide workflow or skill improvements.
This skill is diagnostic. It does NOT apply changes. It is designed for Option B governance: human approval required for process changes.
Trigger
Use this skill whenever ANY of the following occur:
- •a periodic team health review is requested
- •repeated failures or regressions are suspected
- •a spike in bugs, rework loops, or verification failures is observed
- •there is frequent HANDOFF contention or blocking
- •the Lead is asked “what should we improve in our process?”
Authoritative references
This skill treats the following as authoritative sources of metrics signals:
- •docs/quality/BUG_LOG.md
- •docs/team/HANDOFF.md
- •feature artifacts under docs/features/<feature-slug>/
- •docs/team/WORKFLOW.md (for interpreting signals against defined phases and gates)
If additional metrics sources exist (e.g. docs/metrics/*), they may be used, but
the above sources are the baseline.
Inputs
- •time window for analysis (explicit, or default to last 30 days)
- •current workflow topology (docs/team/WORKFLOW.md)
- •metrics sources listed above
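The time-window input can be resolved with a small helper. This is a minimal sketch, not a prescribed implementation: the function name `resolve_window` and its parameters are hypothetical, and only the 30-day default comes from the list above.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

DEFAULT_WINDOW_DAYS = 30  # default stated in the Inputs list


def resolve_window(
    today: date,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> Tuple[date, date]:
    """Return (start, end) for the analysis.

    An explicit start/end wins; otherwise fall back to the last 30 days.
    The name and signature are illustrative assumptions.
    """
    end = end or today
    start = start or (end - timedelta(days=DEFAULT_WINDOW_DAYS))
    if start > end:
        raise ValueError("analysis window start is after its end")
    return start, end
```

Recording the resolved window in the review entry keeps the analysis reproducible when the default was used.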
Outputs
This skill MUST produce a Metrics Review Entry appended to:
- •docs/team/METRICS_REVIEW.md
Each entry MUST include:
- •analysis window
- •signals observed (with counts/trends where possible)
- •diagnosis hypotheses (system-level)
- •confidence level per hypothesis
- •recommended next action(s)
- •whether escalation to a process proposal is warranted
This skill MUST NOT modify workflow, skills, prompts, or agent definitions.
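One possible way to assemble and append an entry is sketched below. The entry layout, the function names `render_entry` and `append_entry`, and the use of plain strings for each field are assumptions for illustration; the source only fixes which fields an entry must contain and that entries are appended.

```python
from pathlib import Path
from typing import List


def render_entry(window: str, signals: List[str], hypotheses: List[str],
                 actions: List[str], escalate: bool) -> str:
    """Render one Metrics Review Entry as text (layout is an assumption)."""
    lines = [f"Metrics Review Entry ({window})", "Signals observed:"]
    lines += [f"- {s}" for s in signals]
    lines.append("Diagnosis hypotheses (with confidence):")
    lines += [f"- {h}" for h in hypotheses]
    lines.append("Recommended next actions:")
    lines += [f"- {a}" for a in actions]
    lines.append(f"Escalate to process proposal: {'yes' if escalate else 'no'}")
    return "\n".join(lines) + "\n"


def append_entry(review_file: Path, entry: str) -> None:
    """Append only: mode "a" never rewrites earlier entries."""
    with review_file.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)
```

The writer touches only the review file passed to it, which matches the constraint that this skill changes nothing else.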
Scope and posture
What this skill optimizes for
- •early detection of systemic issues
- •clear, testable hypotheses
- •actionable recommendations
- •avoidance of metric gaming
What this skill avoids
- •individual blame or scoring
- •single-metric optimization
- •conclusions without evidence
- •changes justified by one-off events
Signals to extract (minimum set)
This skill MUST attempt to extract and comment on:
A) Quality signals (from BUG_LOG)
- •total bugs opened in window
- •bugs by severity (S0–S3)
- •bugs by detected phase
- •bugs by injected phase (suspected)
- •bugs linked to specific features (if tracked)
- •recurrence of similar root causes
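The quality signals above reduce to a few counts once bug records are available in structured form. The sketch below assumes each bug has already been parsed from BUG_LOG into a mapping with the hypothetical keys `severity`, `detected_phase`, and `root_cause`; the actual BUG_LOG format is not specified here, so parsing is left out.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def quality_signals(bugs: Iterable[Mapping[str, str]]) -> Dict[str, object]:
    """Count bugs by severity and detected phase, and flag repeated root causes.

    Field names are illustrative assumptions, not a BUG_LOG schema.
    """
    bugs = list(bugs)
    by_root = Counter(b["root_cause"] for b in bugs if b.get("root_cause"))
    return {
        "total": len(bugs),
        "by_severity": Counter(b.get("severity", "unknown") for b in bugs),
        "by_detected_phase": Counter(b.get("detected_phase", "unknown") for b in bugs),
        # A root cause seen more than once is a recurrence candidate.
        "recurring_root_causes": {c: n for c, n in by_root.items() if n > 1},
    }
```

Records with a missing field are counted under "unknown" rather than dropped, so gaps in the log stay visible in the review.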
B) Flow signals (from feature artifacts and HANDOFF)
- •rework loops (feature sent back after review/validation)
- •late contract changes (contract changes during implementation)
- •verification gate failures
- •frequent blockers or long-lived blockers in HANDOFF
- •contention indicators (overlapping lock conflicts, repeated waiting)
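As an illustration of one flow signal, long-lived blockers can be flagged by age. This sketch assumes blockers have been extracted from HANDOFF into records with an opened date and an optional resolved date; the `Blocker` type and the 5-day threshold are hypothetical, and a team would choose its own cutoff.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Blocker(NamedTuple):
    feature: str
    opened: date
    resolved: Optional[date]  # None while the blocker is still open


def long_lived(blockers: List[Blocker], as_of: date, max_days: int = 5) -> List[Blocker]:
    """Return blockers whose age exceeds max_days.

    Open blockers are aged against as_of; resolved ones against their
    resolution date. The threshold is an assumption, not a source value.
    """
    return [b for b in blockers if ((b.resolved or as_of) - b.opened).days > max_days]
```

The result is a signal to comment on, not a verdict: a count of long-lived blockers per window is more useful than any single case.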
C) Phase health signals (relative to WORKFLOW)
- •gates frequently bypassed or “deferred”
- •missing artifacts detected late
- •work starting without contract freeze or without discovery outputs
If any of these cannot be derived due to missing data, the skill MUST state that explicitly as a finding.
Diagnosis rules (strict)
- •Prefer system explanations over agent explanations.
- •Do not claim causality unless evidence supports it.
- •Treat metrics as signals; produce hypotheses with confidence levels.
- •Tie each hypothesis to at least one concrete observed signal.
- •If evidence is insufficient, state uncertainty and recommend instrumentation.
This skill MUST NOT produce recommendations that require immediate process edits. It may recommend creating a separate process improvement proposal.
Diagnosis structure (required)
For each diagnosis hypothesis, use this structure:
- •Hypothesis: <systemic issue>
- •Evidence: <signals observed>
- •Likely impact: <what this causes>
- •Confidence: high / medium / low
- •Recommended action: <next step, not a process change>
- •Escalate to proposal: yes / no
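The structure above maps naturally onto a small record type. The following is a sketch under stated assumptions: the class name `Hypothesis` and its field names are illustrative, while the two checks mirror the diagnosis rules (at least one observed signal per hypothesis, and a confidence of high, medium, or low).

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_LEVELS = ("high", "medium", "low")


@dataclass(frozen=True)
class Hypothesis:
    hypothesis: str            # the suspected systemic issue
    evidence: List[str]        # observed signals supporting it
    likely_impact: str
    confidence: str            # one of CONFIDENCE_LEVELS
    recommended_action: str    # a next step, not a process change
    escalate_to_proposal: bool

    def __post_init__(self) -> None:
        # Reject hypotheses that are not tied to any observed signal.
        if not self.evidence:
            raise ValueError("each hypothesis needs at least one observed signal")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
```

Validating at construction time means an entry cannot be written with an unsupported hypothesis or a free-form confidence value.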
Examples of valid recommended actions:
- •“Run a deeper review of contract-freeze compliance across the last 5 features.”
- •“Add a template to standardize verification recording.”
- •“Identify the 3 most common root causes in S1 bugs.”
Examples of invalid recommended actions (too direct):
- •“Change the workflow to X.” (belongs in the proposal skill)
- •“Rewrite the agent prompt.” (belongs in the proposal skill)
Explicit non-responsibilities
This skill MUST NOT:
- •modify docs/team/WORKFLOW.md
- •modify skills or prompts
- •reassign agent responsibilities
- •approve features
- •optimize for speed or throughput targets
- •produce individual performance rankings or punitive conclusions
It only diagnoses system health and recommends next steps.
Interaction with other skills
This skill often precedes:
- •process-improvement-proposal → when patterns justify a controlled, human-approved change
It may reference outputs from:
- •bug-management
- •bug-phase-classification
- •workflow-enforcement
- •handoff-coordination
- •verification-gate
But it does not replace any of them.
Failure handling
If metrics are missing or too sparse to diagnose:
- •Record that diagnosis is constrained by missing data
- •List which signals could not be computed and why
- •Recommend minimal instrumentation additions (e.g., templates or required fields)
- •Do NOT fabricate conclusions
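The failure-handling steps above can be sketched as a function that turns uncomputed signals into explicit findings. It assumes signals are collected in a dictionary where `None` marks a signal that could not be computed; the function name, the `reasons` mapping, and the wording of each finding are hypothetical.

```python
from typing import Dict, List, Optional


def missing_data_findings(
    signals: Dict[str, Optional[object]],
    reasons: Dict[str, str],
) -> List[str]:
    """List every signal that could not be computed, with the reason if known.

    None means "not computable"; a value of 0 is a real measurement and
    is deliberately not reported as missing.
    """
    findings = []
    for name, value in signals.items():
        if value is None:
            why = reasons.get(name, "reason not recorded")
            findings.append(f"Signal '{name}' could not be computed: {why}")
    return findings
```

Separating "zero observed" from "not computable" avoids presenting a gap in the data as a healthy result.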
Acceptance criteria
This skill is successful when:
- •it produces a clear, evidence-backed metrics review entry
- •it identifies trends across multiple features (not one-off noise)
- •it distinguishes symptoms from likely causes
- •it avoids individual blame
- •it results in actionable follow-up (analysis or proposal) when warranted
- •it highlights missing instrumentation as an explicit finding
Design principle
Metrics exist to improve the system that builds the product.
If metrics become targets or tools for blame, this skill is being misused.