ci-cd

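Configure and fix CI/CD pipelines, including build, test, lint, deploy, database migrations, environment configuration, observability, and release processes. Use when the user asks about CI, pipelines, GitHub Actions, deployment, fixing the build, environment variables, monitoring, or the release process.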

SKILL.md
--- frontmatter
name: ci-cd
description: "Configure and fix CI/CD pipelines: build, test, lint, deploy, database migrations, environment config, observability, and releases. Use when the user asks about CI, pipeline, GitHub Actions, deploy, fix the build, environment variables, monitoring, or release process."
triggers:
  - "/ci"
  - "CI"
  - "pipeline"
  - "GitHub Actions"
  - "fix the build"
  - "deploy"
  - "release pipeline"
  - "lint in CI"
  - "run tests in CI"
  - "environment variables"
  - "env config"
  - "secrets management"
  - "monitoring"
  - "observability"
  - "alerts"
  - "release process"
  - "versioning"
  - "database migration CI"

CI/CD Skill

Core Philosophy

"Pipeline as code: repeatable, fast, and visible."

Keep CI config in version control, make steps deterministic, and fail fast on build/test/lint. Deploy only from a defined pipeline when possible.


Protocol

1. Identify System

| System | Config location | Typical use |
| --- | --- | --- |
| GitHub Actions | .github/workflows/*.yml | Build, test, lint, release, deploy |
| GitLab CI | .gitlab-ci.yml | Same |
| Jenkins | Jenkinsfile or UI | Same |
| CircleCI | .circleci/config.yml | Same |
| Other | Repo root or ci/, .ci/ | Check project README or docs |

Respect existing layout and naming (e.g. build.yml, test.yml, deploy.yml).

2. Pipeline Stages

  • Build: Compile/install; produce artifacts. Cache deps when supported.
  • Test: Unit and integration; use same commands as local (e.g. npm test, pytest).
  • Lint: Linters and formatters; fail on violation or auto-fix and commit, per project policy.
  • Migrate: Run database migrations before or during deploy (see Database Migrations below).
  • Deploy: Only from main/release or tags; use secrets for credentials; prefer idempotent steps.

Add or change one job/workflow at a time; run the pipeline to verify.
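
The stages above might map onto a GitHub Actions workflow roughly as follows. This is a minimal sketch, assuming a Node project with `build`, `lint`, and `test` scripts in package.json; the file name, the `./scripts/deploy.sh` script, and the `DEPLOY_TOKEN` secret are hypothetical placeholders, not part of this skill.

```yaml
# .github/workflows/ci.yml (illustrative file name)
name: ci

on:
  push:
    branches: [main]
  pull_request:

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm              # cache dependency downloads between runs
      - run: npm ci               # install exactly what the lockfile specifies
      - run: npm run build        # assumes a "build" script exists
      - run: npm run lint         # assumes a "lint" script exists
      - run: npm test

  deploy:
    needs: verify                 # runs only if build, lint, and test succeeded
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh  # hypothetical deploy script
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}  # hypothetical secret name
```

Any failing step stops the `verify` job, and `needs` keeps `deploy` from starting, which is one way to get the fail-fast behavior described in the Core Philosophy.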

3. Database Migrations in CI

| Stage | Action | Considerations |
| --- | --- | --- |
| Test | Run migrations against test DB | Ensure migrations are reversible; test both up and down |
| Staging | Run migrations before code deploy | Validate with production-like data |
| Production | Run migrations with appropriate strategy | Zero-downtime for critical systems |
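
For the Test stage, one possible way to exercise both directions is a job that applies, reverts, and re-applies migrations against a throwaway database. The sketch below assumes Knex with PostgreSQL and a knexfile that reads its connection string from `DATABASE_URL`; the job name and credentials are illustrative only.

```yaml
# Workflow fragment: migration round trip against a disposable Postgres service
jobs:
  migration-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres   # throwaway credentials for CI only
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    env:
      # Assumes the project's knexfile reads this variable
      DATABASE_URL: postgres://postgres:postgres@localhost:5432/postgres
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npx knex migrate:latest          # apply every migration (up)
      - run: npx knex migrate:rollback --all  # revert every migration (down)
      - run: npx knex migrate:latest          # confirm up still works after down
```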

Migration commands by ecosystem:

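| Ecosystem | Migrate | Rollback |
| --- | --- | --- |
| Node (Prisma) | npx prisma migrate deploy | npx prisma migrate reset (dev only; wipes all data and reapplies migrations, so it is not a true rollback) |
| Node (Knex) | npx knex migrate:latest | npx knex migrate:rollback |
| Python (Alembic) | alembic upgrade head | alembic downgrade -1 |
| Python (Django) | python manage.py migrate | python manage.py migrate <app> <migration> (names the earlier migration to return to) |
| Go (goose) | goose up | goose down |
| Ruby (Rails) | rails db:migrate | rails db:rollback |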

CI migration pattern:

  1. Run migrations in a separate job/step before deploy
  2. If migration fails, stop deployment
  3. For rollback: deploy previous code version, then run down migration
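
The first two steps of this pattern could be expressed with job dependencies, as in the sketch below. It assumes Knex and a knexfile that reads `DATABASE_URL`; the job names and `./scripts/deploy.sh` are hypothetical.

```yaml
# Workflow fragment: migrations gate the deploy
jobs:
  migrate:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx knex migrate:latest   # a non-zero exit code fails this job
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

  deploy:
    needs: migrate                     # skipped when the migrate job fails
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh       # hypothetical deploy script
```

A rollback would follow step 3 in the same spirit: redeploy the previous version first, then run `npx knex migrate:rollback` (or the equivalent command from the table above) as a separate, deliberate step.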

4. Environment & Config Management

| Concern | Approach |
| --- | --- |
| Secrets | Use platform secret store (GitHub Secrets, Vault, AWS SSM); never in code or logs |
| Env vars | Define in workflow YAML or platform settings; document required vars in README |
| Config files | Use environment-specific files (.env.production, config/prod.json) or inject at deploy |
| Feature flags | Integrate with flag service (LaunchDarkly, Unleash, custom) or env vars for simple cases |

Environment parity: Keep dev/staging/prod as similar as possible. Document differences (e.g. mock services in dev).

Config by environment:

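```yaml
# Example: GitHub Actions environment-specific deploy (job fragment)
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production # Makes secrets scoped to the production environment available
    env:
      DATABASE_URL: ${{ secrets.DATABASE_URL }}
      API_KEY: ${{ secrets.API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh # Hypothetical deploy script; replace with the project's own
```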

5. Observability & Monitoring

Set up observability in the pipeline and deployed application:

| Layer | What to Configure |
| --- | --- |
| Logging | Structured logs (JSON), log levels, correlation IDs; ship to central system (Datadog, CloudWatch, ELK) |
| Metrics | Application metrics (request count, latency, error rate); infrastructure metrics (CPU, memory) |
| Tracing | Distributed tracing for multi-service systems (OpenTelemetry, Jaeger, Datadog APM) |
| Alerts | Alert on error rate spikes, latency p99, failed deployments; route to on-call |

CI observability steps:

  • Add deployment events/markers to monitoring system
  • Run smoke tests post-deploy and alert on failure
  • Track deployment frequency and failure rate as metrics

Health checks: Add /health or /ready endpoints; CI can verify after deploy.
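
A post-deploy health check could look something like the following. This is a sketch under assumptions: it presumes a job named `deploy` in the same workflow, an application that serves `/health`, and a hypothetical `APP_URL` configuration variable holding the deployed base URL.

```yaml
# Workflow fragment: smoke test that runs after the deploy job
jobs:
  smoke-test:
    needs: deploy                      # assumes a "deploy" job in this workflow
    runs-on: ubuntu-latest
    steps:
      - name: Check /health on the deployed app
        run: |
          # --fail turns HTTP errors into a non-zero exit code;
          # the retry flags give the app time to come up
          curl --fail --silent --show-error \
            --retry 5 --retry-delay 10 --retry-all-errors \
            "${APP_URL}/health"
        env:
          APP_URL: ${{ vars.APP_URL }}   # hypothetical configuration variable
```

If the endpoint never answers successfully, the job fails and the run is marked red, which gives whatever failure notifications are already configured something to hook into.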

MCP (Datadog): When monitoring or observability is in scope, use the Datadog MCP to inspect monitors, query metrics, search logs, and check service health. Tools such as list_monitors, get_monitor_status, query_metrics, search_logs, and get_service_health help validate alerts and deployment impact. Run /setup first so that the Datadog MCP is configured.

6. Release Management

| Aspect | Approach |
| --- | --- |
| Versioning | Semantic versioning (MAJOR.MINOR.PATCH); automate with tools like semantic-release |
| Tagging | Tag releases in git; deploy from tags for production |
| Changelog | Auto-generate from conventional commits or maintain manually (see git-commits skill) |
| Release branches | Use release/* branches for staged releases; or deploy from main with tags |

Release workflow:

  1. Merge to main (or release branch)
  2. CI creates a version tag (derived from commit messages, or set manually)
  3. CI builds release artifacts
  4. Deploy to staging → production (gated)
  5. Create GitHub Release with changelog
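
Steps 3 and 5 of this workflow might be sketched as a tag-triggered workflow like the one below. It assumes the version tag already exists (for example, pushed manually), a Node project with a `build` script, and the GitHub CLI that is preinstalled on GitHub-hosted runners; the staged, gated deploy in step 4 is left out for brevity.

```yaml
# .github/workflows/release.yml (illustrative file name)
name: release

on:
  push:
    tags:
      - "v*.*.*"            # e.g. v1.2.3

permissions:
  contents: write           # required to create a GitHub Release

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm run build  # assumes a "build" script that produces the artifacts
      - name: Create GitHub Release with generated notes
        run: gh release create "$GITHUB_REF_NAME" --generate-notes
        env:
          GH_TOKEN: ${{ github.token }}
```

Here `--generate-notes` asks GitHub to draft the release notes automatically; a project that maintains its changelog by hand or with semantic-release would replace that step.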

7. Fixing Failures

  • Read logs: Identify failing step and error message.
  • Reproduce locally: Run the same command (e.g. install, test, lint) in the same env (version, OS) when possible.
  • Fix: Fix the code or the pipeline (dependency, env var, path, permission). Prefer fixing the cause over relaxing checks (e.g. don't disable tests to make CI green).
  • Secrets: Never log or commit secrets; use the platform's secret store and reference by name.
  • Migration failures: Check for locked tables, constraint violations, or missing dependencies; test migrations in staging first.

8. Conventions

  • Use matrix or parallel jobs for multiple runtimes/versions when relevant.
  • Cache dependency installs (e.g. npm, pip, bundler) to speed runs.
  • Set explicit versions for runtimes (e.g. node-version, python-version) so runs are reproducible.
  • Document in README or docs/ci.md how to run the same steps locally.
  • Document required environment variables and how to obtain secrets.
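
The first three conventions could be combined in a single test job, roughly as follows. The Node versions listed are only an example; a project would list the runtimes it actually supports.

```yaml
# Workflow fragment: matrix, dependency cache, and pinned runtime versions
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false               # let every version finish so all failures are visible
      matrix:
        node-version: [18, 20, 22]   # example versions
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}  # explicit runtime version per job
          cache: npm                                # cache dependency downloads
      - run: npm ci
      - run: npm test
```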

9. Commands

  • Trigger/check: Push branch, open PR, or use "Re-run" in the CI UI.
  • Local parity: Run the same install/test/lint commands as in the workflow (see workflow YAML).
  • Release: npm version patch/minor/major, git tag v1.2.3, or use semantic-release.

Checklist

  • Pipeline config is in repo and under version control.
  • Build and test steps match local commands and versions where practical.
  • Failures are addressed by fixing cause, not by skipping or weakening checks.
  • Secrets are in secret store only; not in logs or code.
  • Deploy steps are gated (branch/tag) and use credentials from secret store.
  • Database migrations run in CI with rollback strategy documented.
  • Environment variables documented; config separated from code.
  • Observability configured: logging, metrics, alerts for deployments.
  • Release process defined: versioning, tagging, changelog generation.