
infrastructure-validation

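Use this skill when working with Terraform (.tf, .tfvars), Ansible (playbooks, roles, inventory), Docker (Dockerfile, docker-compose.yml), CloudFormation, or any other infrastructure-as-code files. It provides validation workflows, tool chains, and common-mistake prevention.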

SKILL.md
--- frontmatter
name: infrastructure-validation
description: Use when working with Terraform (.tf, .tfvars), Ansible (playbooks, roles, inventory), Docker (Dockerfile, docker-compose.yml), CloudFormation, or any infrastructure-as-code files — provides validation workflows, tool chains, and common mistake prevention
<!-- TOKEN BUDGET: 130 lines / ~390 tokens -->

Infrastructure Validation

Activation Triggers

  • Files matching: *.tf, *.tfvars, Dockerfile, docker-compose.yml, playbook*.yml, roles/, inventory/
  • Config: .shipyard/config.json has iac_validation set to "auto" or true

Overview

IaC mistakes don't cause test failures — they cause outages, breaches, and cost overruns. Validate before every change.

Core principle: Never apply without plan review. Just as TDD requires tests before code, IaC requires validation before apply.

File Detection

| Files Present | Workflow |
|---|---|
| *.tf | Terraform |
| playbook*.yml, roles/, inventory/ | Ansible |
| Dockerfile, docker-compose.yml | Docker |
| Templates with AWSTemplateFormatVersion | CloudFormation |
| YAML with apiVersion: | Kubernetes |
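
The detection rules above can be sketched as a small shell function. This is only an illustration: `detect_workflow` is a hypothetical name, and the order of checks (name-based rules first, then content-based rules) is an assumption, not something the skill specifies.

```shell
# Hypothetical helper: print the workflow name for one path.
# Name-based rules run first; content-based rules only apply to existing files.
detect_workflow() {
  local path="$1"
  local base="${path##*/}"

  # Anything under a roles/ or inventory/ directory is treated as Ansible.
  case "$path" in
    roles/*|*/roles/*|inventory/*|*/inventory/*) echo "Ansible"; return 0 ;;
  esac

  case "$base" in
    *.tf)                          echo "Terraform"; return 0 ;;
    playbook*.yml)                 echo "Ansible";   return 0 ;;
    Dockerfile|docker-compose.yml) echo "Docker";    return 0 ;;
  esac

  if [ -f "$path" ]; then
    if grep -q 'AWSTemplateFormatVersion' "$path"; then
      echo "CloudFormation"; return 0
    fi
    if grep -q '^apiVersion:' "$path"; then
      echo "Kubernetes"; return 0
    fi
  fi

  echo "unknown"
}
```

For example, `detect_workflow main.tf` prints `Terraform`, while a YAML file is classified by its content once it exists on disk.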

Terraform Workflow

Run in order. Each step must pass before proceeding.

code
terraform fmt -check          # 1. Format (run `terraform fmt` to auto-fix)
terraform validate            # 2. Syntax validation (requires `terraform init` first)
terraform plan -out=tfplan    # 3. Review every change; NEVER skip
tflint --recursive            # 4. Lint (if installed)
tfsec .                       # 5. Security scan (if installed); alternative: checkov -d .
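
One way to enforce "each step must pass before proceeding" is to chain the steps with `&&` and skip optional tools that are not on the PATH. The sketch below is a possible wrapper, not part of the skill itself; `run_optional`, `security_scan`, and `validate_terraform` are hypothetical names.

```shell
# Run a tool if it is installed; otherwise print a skip notice and succeed.
run_optional() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "skip: $1 not installed"
  fi
}

# Prefer tfsec, fall back to checkov, skip if neither is available.
security_scan() {
  if command -v tfsec >/dev/null 2>&1; then
    tfsec .
  elif command -v checkov >/dev/null 2>&1; then
    checkov -d .
  else
    echo "skip: neither tfsec nor checkov installed"
  fi
}

# Stop at the first failing step; the plan is saved for review, never applied.
validate_terraform() {
  terraform fmt -check &&
    terraform validate &&
    terraform plan -out=tfplan &&
    run_optional tflint --recursive &&
    security_scan
}
```

Run `validate_terraform` from an initialized module directory. A non-zero exit status means a step failed and the later steps did not run.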

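Drift detection: run terraform plan -detailed-exitcode. Exit code 0 means no changes, 1 means the plan failed, and 2 means the plan contains changes. If the configuration has not been edited, exit code 2 indicates drift. Document what drifted and why before overwriting.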
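
A script can branch on that exit code rather than parsing plan output. The following is a minimal sketch; `classify_plan_exit` and `check_drift` are hypothetical helper names, and the labels they print are arbitrary.

```shell
# Map the exit code of `terraform plan -detailed-exitcode` to a label.
classify_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Run the plan quietly and report the result. The `|| rc=$?` form captures
# the exit code without aborting scripts that use `set -e`.
check_drift() {
  local rc=0
  terraform plan -detailed-exitcode >/dev/null || rc=$?
  classify_plan_exit "$rc"
}
```

When `check_drift` prints `changes pending` for configuration nobody has edited, treat it as drift and investigate before applying.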

Ansible Workflow

code
yamllint .                              # 1. YAML syntax
ansible-lint                            # 2. Best practices
ansible-playbook --syntax-check *.yml   # 3. Playbook syntax
ansible-playbook --check *.yml          # 4. Dry run (where supported)
molecule test                           # 5. Role tests (if configured)
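
When a directory mixes playbooks with other YAML files, passing an explicit list of playbooks may be more robust than globbing `*.yml`. The sketch below runs steps 3 and 4 per playbook and reports each result; `check_playbooks` is a hypothetical name, and sending tool output to stderr is a design choice made here to keep the summary lines clean.

```shell
# Hypothetical helper: syntax-check, then dry-run, each playbook given as an
# argument. Prints one summary line per playbook on stdout and returns
# non-zero if any playbook failed either step.
check_playbooks() {
  local failed=0
  local pb
  for pb in "$@"; do
    if ansible-playbook --syntax-check "$pb" >&2 &&
       ansible-playbook --check "$pb" >&2; then
      echo "ok: $pb"
    else
      echo "FAILED: $pb"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `check_playbooks playbook-web.yml playbook-db.yml` would then print one line per playbook (the file names are placeholders).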

Docker Workflow

code
hadolint Dockerfile                     # 1. Lint (if installed)
docker build -t test-build .            # 2. Build
trivy image test-build                  # 3. Security scan (if installed)
docker compose config                   # 4. Validate compose (if applicable)

Common Mistakes

Terraform

| Mistake | Fix |
|---|---|
| Local state file | Use remote backend (S3+DynamoDB, GCS) |
| No state locking | Enable lock table |
| Hardcoded secrets | Use variables + secret manager |
| Open ingress (0.0.0.0/0) in security groups | Restrict to specific CIDRs |
| Unpinned provider version | Pin in required_providers |
| Missing tags | Require via policy or module defaults |

Ansible

| Mistake | Fix |
|---|---|
| Plaintext secrets | ansible-vault encrypt |
| shell instead of modules | Use native modules (apt, copy, etc.) |
| Everything as root | become: false by default, escalate only when needed |
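
Files encrypted with ansible-vault begin with a header line starting with `$ANSIBLE_VAULT;`, so a quick check for plaintext secret files is possible. This is a rough sketch that only covers whole-file encryption, not inline `!vault` values; both function names are hypothetical.

```shell
# Succeeds if the file's first line is an ansible-vault header.
is_vault_encrypted() {
  head -n 1 "$1" 2>/dev/null | grep -q '^\$ANSIBLE_VAULT;'
}

# Print every given file that is not vault-encrypted.
list_plaintext_secrets() {
  local f
  for f in "$@"; do
    is_vault_encrypted "$f" || echo "plaintext: $f"
  done
}
```

Which files count as secrets is up to the project; a call like `list_plaintext_secrets group_vars/*/vault.yml` assumes one common layout.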

Docker

| Mistake | Fix |
|---|---|
| FROM ubuntu:latest | Pin to digest: FROM ubuntu:22.04@sha256:... |
| Running as root | Add USER nonroot |
| COPY . . | Use .dockerignore, copy specific files |
| Secrets in ENV/ARG | Use build secrets or runtime injection |
| No health check | Add HEALTHCHECK instruction |
| Single-stage build | Use multi-stage builds |
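
Several of these mistakes can be caught with a rough grep-based pre-check when hadolint is unavailable. The sketch below is illustrative only: `check_dockerfile` is a hypothetical name, the patterns are simple heuristics that do not parse Dockerfile syntax, and it is not a substitute for a real linter.

```shell
# Hypothetical helper: print one warning per suspected mistake and return
# non-zero if any warning was printed. Heuristic only.
check_dockerfile() {
  local file="${1:-Dockerfile}"
  local warnings=0
  local stages

  [ -f "$file" ] || { echo "error: $file not found" >&2; return 2; }

  if grep -Eiq '^FROM[[:space:]].*:latest([[:space:]]|$)' "$file"; then
    echo "warn: base image uses the latest tag"; warnings=1
  fi
  if ! grep -Eiq '^USER[[:space:]]' "$file"; then
    echo "warn: no USER instruction, container runs as root"; warnings=1
  fi
  if ! grep -Eiq '^HEALTHCHECK[[:space:]]' "$file"; then
    echo "warn: no HEALTHCHECK instruction"; warnings=1
  fi
  if grep -Eiq '^(ENV|ARG)[[:space:]].*(PASSWORD|SECRET|TOKEN)' "$file"; then
    echo "warn: possible secret in ENV/ARG"; warnings=1
  fi
  # grep -c exits non-zero on a zero count, so guard it for `set -e` scripts.
  stages="$(grep -Eic '^FROM[[:space:]]' "$file" || true)"
  if [ "$stages" -lt 2 ]; then
    echo "warn: single-stage build"; warnings=1
  fi
  return "$warnings"
}
```

Known gaps in this sketch: an explicit `USER root` passes the check, and an untagged base image is not flagged.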

Red Flags — STOP

  • terraform apply -auto-approve without prior plan review
  • Security group with 0.0.0.0/0 on non-HTTP(S) ports
  • IAM policy with * action or * resource
  • Secrets in .tf, .yml, or Dockerfile
  • State file committed to git
  • latest tag on any base image
  • Container running as root in production
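
Some of these red flags lend themselves to automated checks. As one example, the sketch below looks for Terraform state files among the paths git tracks; the function names are hypothetical, and the file name pattern (`*.tfstate` and `*.tfstate.backup`) is an assumption about how state files are named in the repository.

```shell
# Read file paths on stdin and print those that look like Terraform state.
find_state_files() {
  grep -E '(^|/)[^/]*\.tfstate(\.backup)?$' || true
}

# Fail with a STOP message if git tracks any state file.
check_no_state_in_git() {
  local tracked
  tracked="$(git ls-files | find_state_files)"
  if [ -n "$tracked" ]; then
    echo "STOP: state files tracked by git:"
    echo "$tracked"
    return 1
  fi
}
```

Run `check_no_state_in_git` from the repository root, for instance in a pre-commit hook or CI step.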

Integration

Referenced by: shipyard:builder (detects IaC files, follows appropriate workflow), shipyard:verifier (IaC validation mode), shipyard:auditor (IaC security checks)

Pairs with: shipyard:security-audit (security lens for IaC), shipyard:shipyard-verification (IaC claims need validation evidence)