A/B Testing
Trigger Boundary
- •Use when a product-impacting change must be validated with a controlled experiment.
- •Do not use for deterministic functional verification; use testing-*.
- •Do not use for long-term reliability telemetry design; use observability-*.
Goal
Produce statistically and operationally sound experiment decisions.
Inputs
- •Experiment hypothesis and expected user/business outcome
- •Current baseline metrics and traffic constraints
- •Risk thresholds, guardrails, and stop conditions
Outputs
- •Experiment plan with metric definitions and guardrails
- •Analysis plan and decision thresholds
- •Post-experiment decision record and follow-up actions
Workflow
- •Define hypothesis, target population, and experiment guardrails.
- •Define primary and secondary metrics with decision thresholds.
- •Validate randomization and sample-size assumptions.
- •Run experiment with safety monitors and stop criteria.
- •Analyze outcomes and publish decision with confidence bounds.
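The sample-size and analysis steps above can be sketched as follows. This is a minimal illustration, assuming a binary conversion metric, a two-sided z-test, and equal allocation between arms; the function names `sample_size_per_arm` and `lift_with_ci` and the default `alpha` and `power` values are assumptions for this example, not part of the workflow itself.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per arm to detect an absolute lift `mde`
    over a `baseline` conversion rate with a two-sided z-test."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


def lift_with_ci(conv_a: int, n_a: int, conv_b: int, n_b: int,
                 alpha: float = 0.05) -> tuple[float, float, float]:
    """Return (lift, lower, upper) for the difference in conversion rates
    B - A, using an unpooled normal-approximation (Wald) interval."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lift = p_b - p_a
    return lift, lift - z * se, lift + z * se
```

For example, `sample_size_per_arm(0.10, 0.02)` gives a figure on the order of 3,800 users per arm, and `lift_with_ci(100, 1000, 130, 1000)` returns the observed lift together with the bounds to publish alongside the decision. The normal approximation is a simplification; small samples or rare events may call for a different method.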
Quality Gates
- •Metrics and stop conditions are explicit and auditable.
- •Experiment population and randomization assumptions are documented.
- •Decision criteria are defined before outcome analysis.
- •Privacy and compliance checks pass for user data handling.
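One way to make the randomization gate auditable is a sample ratio mismatch check that compares the observed arm sizes with the planned allocation. The sketch below assumes a two-arm experiment and uses a chi-square goodness-of-fit test with one degree of freedom; the function name `srm_p_value` and the 0.001 alert threshold in the usage note are assumptions for illustration.

```python
import math


def srm_p_value(n_control: int, n_treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the
    observed split versus the planned allocation."""
    total = n_control + n_treatment
    exp_c = total * expected_control_share
    exp_t = total - exp_c
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))
```

A very small p-value, for instance below an agreed threshold such as 0.001, suggests the assignment mechanism deviates from the documented allocation and the results should not be analyzed until the cause is understood.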
Failure Handling
- •Stop when metric definitions or stop conditions are missing.
- •Escalate when experiment risk exceeds agreed guardrails.
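The two rules above can be expressed as a small pre-flight and runtime check. This is a hedged sketch: the `ExperimentPlan` fields, the `next_action` function, and the single relative-drop guardrail are hypothetical simplifications, and a real plan would likely carry several guardrails.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentPlan:
    # All field names are hypothetical; adapt them to your own plan format.
    hypothesis: str
    primary_metric: str = ""
    stop_conditions: list[str] = field(default_factory=list)
    max_guardrail_drop: float = 0.0  # agreed tolerable relative drop, e.g. 0.02


def next_action(plan: ExperimentPlan, observed_guardrail_drop: float) -> str:
    """Return 'stop', 'escalate', or 'continue' following the rules above."""
    # Missing metric definitions or stop conditions block the experiment.
    if not plan.primary_metric or not plan.stop_conditions:
        return "stop"
    # Risk beyond the agreed guardrail is escalated rather than decided locally.
    if observed_guardrail_drop > plan.max_guardrail_drop:
        return "escalate"
    return "continue"
```

Checking for missing definitions first means an incomplete plan always stops, even when the observed guardrail metric happens to look healthy.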