operational-readiness

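Operational readiness checklist for Reown services. Use when service owners want to: check production readiness, validate a service before launch, run an operational readiness review, audit service compliance, confirm whether a service is ready for production, or validate infrastructure and security posture. Triggers: "operational readiness", "production readiness", "launch checklist", "service review", "pre-launch audit", "ORC", "is my service ready", "check my service", "readiness review"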

SKILL.md
---
name: operational-readiness
description: |
  Operational Readiness Checklist for Reown services. Use when service owners ask to: check production readiness, validate a service before launch, run an operational readiness review, audit service compliance, check if a service is ready for production, or validate infrastructure/security posture.

  Triggers: "operational readiness", "production readiness", "launch checklist", "service review", "pre-launch audit", "ORC", "is my service ready", "check my service", "readiness review"
---

Operational Readiness Checklist

A comprehensive checklist for validating services before production launch. It analyzes the codebase and asks interactive questions about items that cannot be detected from code.

Workflow Overview

  1. Gather context - Identify service type, tech stack, and traffic expectations
  2. Analyze codebase - Scan for CI/CD configs, infrastructure code, security patterns
  3. Interactive verification - Ask about items that cannot be detected from code
  4. Generate report - Produce checklist report with priorities and remediation guidance

Step 1: Gather Context

Ask the user these questions using AskUserQuestion:

Service Classification:

  • Service type: Backend API, Frontend/Web App, Infrastructure/Platform, or Hybrid
  • Expected traffic: <100 req/min (low), 100-1000 req/min (medium), >1000 req/min (high)
  • Data handling: Stores user data (yes/no), Processes PII (yes/no)
  • Public-facing: Yes/No
  • Has email functionality: Yes/No
  • Uses database: Yes/No (if yes, which: PostgreSQL, Supabase, DynamoDB, etc.)

Tech Stack Detection: Auto-detect the stack from these marker files:

  • Cargo.toml → Rust service
  • package.json → Node.js/TypeScript
  • *.tf or *.tfvars → Terraform
  • cdk.json or *.cdk.ts → AWS CDK
  • .github/workflows/*.yml → GitHub Actions CI/CD
  • next.config.js → Next.js frontend
  • Dockerfile → Containerized service
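
The marker-file mapping above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function name `detect_stack` and the choice to search bare file names recursively are assumptions, and a real scan would likely skip directories such as `node_modules`.

```python
from pathlib import Path

# Marker patterns and labels copied from the list above.
STACK_MARKERS = {
    "Cargo.toml": "Rust service",
    "package.json": "Node.js/TypeScript",
    "*.tf": "Terraform",
    "*.tfvars": "Terraform",
    "cdk.json": "AWS CDK",
    "*.cdk.ts": "AWS CDK",
    ".github/workflows/*.yml": "GitHub Actions CI/CD",
    "next.config.js": "Next.js frontend",
    "Dockerfile": "Containerized service",
}


def detect_stack(root: str) -> list[str]:
    """Return the stack labels whose marker files exist under root."""
    base = Path(root)
    found = []
    for pattern, label in STACK_MARKERS.items():
        # Assumption: patterns with a directory part are anchored at the
        # repository root; bare file names are searched recursively.
        matches = base.glob(pattern) if "/" in pattern else base.rglob(pattern)
        if any(matches) and label not in found:
            found.append(label)
    return found
```

Calling `detect_stack(".")` from a repository root returns the detected labels in the order of the list above, with duplicates (for example `*.tf` and `*.tfvars`) collapsed into one entry.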

Step 2: Codebase Analysis

Analyze the codebase for evidence of checklist items. Use Glob and Grep to find:

CI/CD Detection:

code
.github/workflows/*.yml - GitHub Actions
Cargo.toml + [profile.release] - Rust build config
jest.config.* / vitest.config.* - Test configuration
*.tf - Terraform files
cdk.json - CDK configuration

Security Detection:

code
**/security*.yml - Security scanning workflows
dependabot.yml - Dependency updates
CODEOWNERS - Code ownership
*.lock files - Dependency locking

Observability Detection:

code
**/tracing*.rs or opentelemetry* - Distributed tracing
sentry.* or @sentry/* - Error tracking
prometheus* or metrics* - Metrics collection
**/logging*.* or log4* or tracing* - Logging config

Infrastructure Detection:

code
**/autoscaling* in .tf files - Autoscaling config
**/secretsmanager* or **/ssm* - Secrets management
health* endpoints in code - Health checks
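
As a rough sketch of how the Glob and Grep steps above might be combined, the Python below pairs a file glob with a regular expression per checklist item. The three rules, the `EVIDENCE_RULES` name, and `find_evidence` are hypothetical examples chosen for illustration; the skill itself relies on the agent's Glob and Grep tools rather than this code.

```python
import re
from pathlib import Path

# Hypothetical evidence rules: checklist item -> (file glob, regex).
EVIDENCE_RULES = {
    "Sentry instrumentation": ("package.json", r"@sentry/"),
    "Autoscaling configured": ("*.tf", r"autoscaling"),
    "Healthcheck endpoint": ("*.rs", r"/health"),
}


def find_evidence(root: str) -> dict[str, list[str]]:
    """Map each rule to the relative paths of files containing a match."""
    base = Path(root)
    evidence = {}
    for item, (file_glob, regex) in EVIDENCE_RULES.items():
        pattern = re.compile(regex, re.IGNORECASE)
        hits = []
        for path in sorted(base.rglob(file_glob)):
            if not path.is_file():
                continue
            # Ignore undecodable bytes so binary-ish files do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
            if pattern.search(text):
                hits.append(path.relative_to(base).as_posix())
        evidence[item] = hits
    return evidence
```

An empty list for an item means no evidence was found in code, which is a cue to fall back to asking the service owner rather than to mark the item as failing.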

Step 3: Interactive Verification

For items that cannot be detected from code, ask yes/no questions. Group questions by category to avoid overwhelming the user.
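
One way to picture the grouping is the Python sketch below, which collects questions per category and splits each group into small batches. The question wording, the `group_questions` helper, and the default batch size of 4 are assumptions made for illustration; the checklist only says to group by category.

```python
from collections import defaultdict

# A few "Ask" items from the checklist as (category, question) pairs.
# The question wording is illustrative, not prescribed by the checklist.
ASK_ITEMS = [
    ("Observability", "Is a log retention policy of at least 1 year in place?"),
    ("CI/CD & Testing", "Has load testing been performed?"),
    ("Observability", "Are DB/Queue CPU, disk, and memory monitored?"),
    ("CI/CD & Testing", "Is the rollback procedure documented and tested?"),
]


def group_questions(items, batch_size=4):
    """Group questions by category, then split each group into batches."""
    by_category = defaultdict(list)
    for category, question in items:
        by_category[category].append(question)
    batches = []
    for category, questions in by_category.items():
        for start in range(0, len(questions), batch_size):
            batches.append((category, questions[start:start + batch_size]))
    return batches
```

Each returned batch can then be presented as a single prompt, so the user answers one category at a time.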

Step 4: Generate Report

Output format:

markdown
# Operational Readiness Report: [Service Name]

**Service Type:** [Backend API / Frontend / Infrastructure]
**Tech Stack:** [Detected stack]
**Generated:** [Date]

## Summary
- **Overall Readiness:** [X/Y items passing] ([Z%])
- **Launch Blockers (P0):** [count]
- **High Priority (P1):** [count]
- **Medium Priority (P2):** [count]
- **Low Priority (P3):** [count]

## Observability
| Item | Status | Priority | Notes |
|------|--------|----------|-------|
| ... | ✅/❌/⚠️ | P0-P3 | ... |

[Repeat for each category]

## Remediation Summary
[List failing items with links to remediation guidance]
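
To make the per-category table in the format above concrete, here is a small Python sketch that renders one section. The `render_category` helper and the sample row are hypothetical; only the column layout comes from the template.

```python
def render_category(title, rows):
    """Render one category section; rows are (item, status, priority, notes)."""
    lines = [
        f"## {title}",
        "| Item | Status | Priority | Notes |",
        "|------|--------|----------|-------|",
    ]
    for row in rows:
        # Escape pipes so a cell value cannot break the table layout.
        cells = [str(cell).replace("|", "\\|") for cell in row]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


print(render_category(
    "Observability",
    [("Sentry instrumentation", "✅", "P1", "Found @sentry dependency")],
))
```

Repeating the call once per category and joining the results produces the body of the report.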

Checklist Items by Category

Observability (O11Y)

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Alarmable top-level metric OR Canary (OpsGenie integrated) | P0 | High traffic (>100 req/min) | Ask |
| Canary coverage (if <100 req/min) | P0 | Low traffic | Ask |
| DB/Queue monitoring (CPU/Disk/Memory) | P1 | Services with DB/Queue | Ask |
| Logging configured and viewable | P1 | All | Grep for logging config |
| Log retention policy (min 1 year for SOC2) | P1 | All | Ask |
| Distributed tracing (OpenTelemetry/Jaeger) | P2 | Backend services | Grep for otel/tracing |
| Sentry instrumentation | P1 | Frontend only | Grep for @sentry |
| status.reown.com integration | P3 | Public-facing | Ask |

Remediation: See references/remediation-o11y.md


CI/CD & Testing

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| CI runs unit/functional tests (>80% critical path coverage) | P0 | All | Check workflow files |
| CD runs integration/e2e tests | P1 | All | Check workflow files |
| Load testing performed | P1 | High traffic / user-facing | Ask |
| Rollback procedure documented and tested | P1 | All | Ask |
| Post-deploy health checks | P2 | All | Check workflow files |

Remediation: See references/remediation-cicd.md


Primitives (Infrastructure)

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Runbook documented (failure modes, troubleshooting, escalation) | P0 | All | Ask |
| Infrastructure as code (Terraform/CDK) | P0 | All | Check for .tf or cdk files |
| Autoscaling configured | P1 | Backend services | Grep .tf for autoscaling |
| Healthcheck endpoint (memory, filesystem, dependencies) | P1 | All | Grep for /health endpoint |
| Multi-AZ deployment (2+ pods/instances) | P1 | All | Ask |
| Secrets management (AWS SM, Vault) - no secrets in code | P0 | All | Grep for hardcoded secrets, check .tf |
| Configuration management (env separation) | P2 | All | Check for env-specific configs |
| Data Lake integration | P3 | Analytics needs | Ask |

Remediation: See references/remediation-primitives.md


Security

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| OWASP Top 10 2025 validation | P0 | All | Ask |
| Secure design review (threat modeling) | P1 | All | Ask |
| Dependency scanning enabled + SBOM | P1 | All | Check for dependabot, snyk |
| Software/data integrity (code signing, CI/CD security) | P2 | All | Ask |
| Fail-secure exception handling | P1 | All | Code review |
| Service-to-service auth (mTLS, JWT, API keys) | P1 | Backend with internal APIs | Ask |
| Clickjacking headers (X-Frame-Options, CSP) | P1 | Frontend only | Grep for security headers |
| SPF records | P2 | Services with email | Ask |
| DKIM records | P2 | Services with email | Ask |
| RLS policies (Supabase/DB) | P0 | Services with Supabase | Ask |
| Rate limiting | P1 | Public APIs | Grep for rate limit config |
| DDoS protection (Cloudflare/AWS Shield) | P1 | Public-facing | Ask |
| API authentication | P1 | Public APIs | Grep for auth middleware |
| Audit logging (auth, admin, data access) | P2 | All | Grep for audit log |

Remediation: See references/remediation-security.md


3rd Party Services

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Metrics integration for 3rd parties | P2 | Services using 3rd parties | Ask |
| Status page integration (Slack channel minimum) | P2 | Services using 3rd parties | Ask |
| RPC rate limits configured | P1 | Services using RPCs | Ask |

Remediation: See references/remediation-dependencies.md


Service Dependencies

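| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Upstream dependencies documented | P1 | All | Ask |
| Downstream dependencies documented | P1 | All | Ask |
| Dependency health in service health endpoint | P2 | All | Code review |
| Fallback behavior for non-critical deps | P2 | All | Ask |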

Remediation: See references/remediation-dependencies.md


Data Retention & Privacy

| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Data retention policy defined | P1 | Services with persistent data | Ask |
| GDPR: Personal data identified | P1 | Services handling user data | Ask |
| GDPR: DSAR process defined | P1 | Services handling user data | Ask |
| GDPR: Right to be forgotten process | P1 | Services handling user data | Ask |
| Privacy policy updated | P2 | User-facing services | Ask |
| DPAs with third-party processors | P2 | Services sharing data | Ask |

Remediation: See references/remediation-privacy.md


Efficiency & Frugality

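| Item | Priority | Applies To | Detection Method |
|------|----------|------------|------------------|
| Resource-efficient implementation | P2 | All | Code review |
| Cost scaling model documented | P2 | All | Ask |
| Spend caps / usage alerts configured | P2 | All | Ask |
| FinOps review completed | P3 | All | Ask |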

Remediation: See references/remediation-efficiency.md


Priority Definitions

| Priority | Meaning | Action Required |
|----------|---------|-----------------|
| P0 | Launch blocker | Must fix before production |
| P1 | High priority | Fix within current sprint |
| P2 | Medium priority | Fix within quarter |
| P3 | Nice to have | Address when convenient |

Status Indicators

  • ✅ Pass - Item verified as compliant
  • ❌ Fail - Item not compliant, needs remediation
  • ⚠️ Partial - Partially compliant, improvements needed
  • N/A - Not applicable to this service type
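
The summary figures in the report (items passing, percentage, and open counts per priority) could be derived from the statuses roughly as follows. This Python sketch rests on two assumptions the checklist does not spell out: N/A items are excluded from the total, and Partial items count as not passing. The `summarize` function and the status strings are hypothetical.

```python
from collections import Counter


def summarize(results):
    """Summarize (priority, status) pairs; status is pass, fail, partial, or na."""
    # Assumption: N/A items do not count toward the readiness total.
    applicable = [(p, s) for p, s in results if s != "na"]
    passing = sum(1 for _, s in applicable if s == "pass")
    # Assumption: partial items are treated as open, like failures.
    open_by_priority = Counter(p for p, s in applicable if s != "pass")
    percent = round(100 * passing / len(applicable)) if applicable else 0
    return {
        "passing": passing,
        "total": len(applicable),
        "percent": percent,
        "open": {p: open_by_priority.get(p, 0) for p in ("P0", "P1", "P2", "P3")},
    }
```

Under these assumptions, any non-zero `open["P0"]` count indicates a launch blocker regardless of the overall percentage.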