AgentSkillsCN

cicd-debug-loop

通过工件升级,自动调试本仓库的从开发环境到生产环境的部署管道。

SKILL.md
--- frontmatter
name: cicd-debug-loop
description: Auto-debug this repo's dev→prod deployment pipeline with artifact promotion
license: MIT
compatibility: opencode
metadata:
  audience: maintainers
  workflow: build-and-deploy.yml

What I Do

I debug the deployment pipeline for this specific repo:

  • .github/workflows/build-and-deploy.yml
  • GCS buckets: dev.briananderson.xyz, briananderson.xyz
  • E2E tests: Playwright on Chromium
  • PostHog analytics integration

Pipeline Architecture

code
push to main → build job → deploy-dev → deploy-prod (manual)
                                          ↓
                                    E2E tests (dev)
                                          ↓
                                    smoke tests (prod)

My Debugging Process

1. Quick Check

bash
gh run list --workflow=build-and-deploy.yml --limit=5 --json conclusion,status,displayTitle

2. Identify Failing Job

  • build: PostHog secrets, TypeScript check, build, UI tests
  • deploy-dev: GCS upload, Playwright E2E tests
  • deploy-prod: GCS upload, smoke tests, rollback on failure

3. Job-Specific Debugging

Build Job:

  • Check PostHog secrets: gh secret list
  • PostHog validation script: site/scripts/test:posthog
  • TypeScript errors in build output

Deploy-Dev Job:

  • GCS authentication: Check GCP workload identity
  • Playwright issues:
    • Install: npx playwright install chromium
    • Tests: Use PLAYWRIGHT_TEST_BASE_URL env var
    • Skip webServer in CI: process.env.CI
  • DNS errors: dev.briananderson.xyz may not resolve

Deploy-Prod Job:

  • Manual trigger via GitHub UI
  • Approval required
  • Auto-rollback on smoke test failure

4. Apply Fix Pattern

PostHog Issues:

bash
# Check secrets
gh secret list

# Add missing secrets via GitHub UI or CLI
gh secret set PUBLIC_POSTHOG_KEY
gh secret set PUBLIC_POSTHOG_HOST

Playwright Issues:

bash
# CLI syntax error - use env var, not --base-url flag
env:
  PLAYWRIGHT_TEST_BASE_URL: https://dev.briananderson.xyz
run: npx playwright test

# Disable webServer in CI (playwright.config.ts)
webServer: process.env.CI ? undefined : { ... }

DNS Resolution:

  • Make E2E tests optional with continue-on-error: true
  • Or set up DNS for dev/prod domains

5. Push and Monitor

bash
git add .github/workflows/build-and-deploy.yml
git commit -m "fix: <description>"
git push origin main

# Wait for run
sleep 30
gh run list --workflow=build-and-deploy.yml --limit=2

6. Verify Deployment

bash
# Check dev
curl -I https://dev.briananderson.xyz

# Check prod (after manual trigger)
curl -I https://briananderson.xyz

Known Issues & Solutions

IssueSolution
PostHog secrets missingAdd PUBLIC_POSTHOG_KEY, PUBLIC_POSTHOG_HOST via GitHub UI
error: unknown option '--base-url=...'Use PLAYWRIGHT_TEST_BASE_URL env var instead
Cannot find package '@playwright/test'Install deps in deploy job before running tests
net::ERR_NAME_NOT_RESOLVEDAdd continue-on-error: true to E2E test step
Timeout on Playwright installAccept ~6 min download time, no fix needed

Testing Locally

bash
cd site
pnpm run validate
pnpm run build
pnpm preview -- --host
npx playwright test --base-url=http://localhost:4173

Manual Deploy to Prod

  1. Go to: Actions → Build and Deploy workflow
  2. Click "Run workflow"
  3. Check "deploy_prod" checkbox
  4. Approve when prompted

Manual Rollback

  1. Go to: Actions → Build and Deploy workflow
  2. Click "Run workflow"
  3. Check "rollback" checkbox
  4. Select target: dev or prod

When to Use Me

User asks: "My deployment pipeline is broken" User asks: "The workflow is stuck" User asks: "Tests are failing in CI"

Then load this skill and run through the debugging process.