platform-engineering

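Platform engineering expertise. Use this skill when building internal developer platforms, self-service infrastructure, golden paths, Backstage portals, Crossplane compositions, ArgoCD configurations, Tekton pipelines, or multi-tenancy abstractions.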

SKILL.md
---
name: platform-engineering
description: >
  Platform engineering expertise. Use when building internal developer
  platforms, self-service infrastructure, golden paths, Backstage portals,
  Crossplane compositions, ArgoCD configurations, Tekton pipelines,
  or multi-tenancy abstractions.
---

Platform Engineering Principles

Golden Paths

  • Provide opinionated defaults that cover 80% of use cases
  • Make the right thing the easy thing: secure, observable, and compliant by default
  • Allow escape hatches for the 20% that need customisation
  • Version golden paths and communicate breaking changes

Self-Service Abstractions

  • Platform teams own Compositions/XRDs (Crossplane) or Helm library charts
  • Application teams consume Claims/values files
  • Hide infrastructure complexity behind simple interfaces:
    ```yaml
    # What the app team writes:
    apiVersion: platform.example.com/v1alpha1
    kind: Application
    metadata:
      name: my-service   # Kubernetes object name, required to apply the manifest
    spec:
      name: my-service
      team: payments
      tier: production
      replicas: 3
    ```
  • Validate inputs with CEL validation rules or admission webhooks
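
As an illustration of the platform-owned side of that interface, the following is a minimal sketch of a schema for the `Application` kind with one CEL rule. It uses a plain Kubernetes CustomResourceDefinition rather than a Crossplane XRD to keep the example short. The allowed `tier` values, the required fields, and the replica rule are assumptions made for the example, not part of the original interface, and CEL rules need a cluster version that supports `x-kubernetes-validations`.

```yaml
# Hypothetical schema backing the Application kind shown above.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: applications.platform.example.com   # must be <plural>.<group>
spec:
  group: platform.example.com
  scope: Namespaced
  names:
    kind: Application
    plural: applications
    singular: application
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required: [spec]
          properties:
            spec:
              type: object
              required: [name, team, tier]
              properties:
                name:
                  type: string
                team:
                  type: string
                tier:
                  type: string
                  # Assumed set of tiers; adjust to the platform's own.
                  enum: [development, staging, production]
                replicas:
                  type: integer
                  minimum: 1
                  default: 1
              # CEL rule evaluated by the API server on create and update.
              x-kubernetes-validations:
                - rule: "self.tier != 'production' || self.replicas >= 2"
                  message: "production workloads need at least 2 replicas"
```

Keeping the rule in the schema means a bad request is rejected at admission time with a readable message, without the platform team having to run a separate webhook for simple cross-field checks.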

GitOps

  • ArgoCD or Flux for continuous delivery
  • App-of-apps pattern for managing multiple services
  • ApplicationSets for templated multi-cluster/multi-env deployment
  • Separate repos: application code vs deployment manifests
  • Promotion: dev → staging → production, with each step made through a pull request
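
The templated multi-environment deployment can be sketched with an ArgoCD ApplicationSet using a list generator. The repository URL, cluster URLs, directory layout, and namespace names below are placeholders chosen for the example.

```yaml
# Sketch: one ArgoCD Application per environment, generated from a list.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-service
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          # Each element becomes one generated Application.
          - env: dev
            cluster: https://dev.k8s.example.com
          - env: staging
            cluster: https://staging.k8s.example.com
          - env: production
            cluster: https://prod.k8s.example.com
  template:
    metadata:
      name: 'my-service-{{env}}'
    spec:
      project: default
      source:
        # Deployment manifests live in their own repo, apart from app code.
        repoURL: https://github.com/example/deploy-manifests.git
        targetRevision: main
        path: 'my-service/overlays/{{env}}'
      destination:
        server: '{{cluster}}'
        namespace: my-service
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

With this layout, promoting a change amounts to a pull request that updates the overlay directory of the next environment in the manifests repository.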

CI/CD

  • Tekton Pipelines for Kubernetes-native CI
  • GitHub Actions for source-level CI (lint, test, build)
  • Image builds: ko for Go, buildah for general containers
  • Image signing with Sigstore/cosign
  • SBOM generation with syft
  • Vulnerability scanning with trivy or grype
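
The supply-chain steps above could be chained in a single job, sketched here as a GitHub Actions workflow. It assumes that `syft`, `trivy`, and `cosign` are already installed on the runner, that the image was built and pushed under the commit SHA by an earlier job, and that registry login is handled elsewhere; the image name is a placeholder.

```yaml
# Sketch: SBOM, vulnerability scan, then keyless signing of a pushed image.
name: supply-chain
on:
  push:
    branches: [main]
permissions:
  contents: read
  id-token: write   # OIDC token used by cosign for keyless signing
jobs:
  scan-and-sign:
    runs-on: ubuntu-latest
    env:
      # Placeholder image reference; signing by digest is preferable to a tag.
      IMAGE: registry.example.com/payments/my-service:${{ github.sha }}
    steps:
      - name: Generate SBOM with syft
        run: syft "$IMAGE" -o spdx-json > sbom.spdx.json
      - name: Scan with trivy and fail on high or critical findings
        run: trivy image --exit-code 1 --severity HIGH,CRITICAL "$IMAGE"
      - name: Sign with cosign (keyless)
        run: cosign sign --yes "$IMAGE"
```

Scanning before signing keeps a signature from being attached to an image that has already failed the vulnerability gate.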

Multi-Tenancy

  • Namespace-per-team with ResourceQuotas and LimitRanges
  • NetworkPolicies for namespace isolation
  • RBAC: team-scoped Roles, platform-scoped ClusterRoles
  • Hierarchical Namespace Controller (HNC) for delegating sub-namespaces to teams
  • Capsule or vCluster for stronger isolation
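
A namespace-per-team baseline might look like the following sketch. The team name, group name, label key, and all quota and limit values are illustrative assumptions. The RoleBinding reuses the built-in `edit` ClusterRole, scoped to the team namespace.

```yaml
# Sketch: per-team namespace with quota, default limits, isolation, and RBAC.
apiVersion: v1
kind: Namespace
metadata:
  name: team-payments
  labels:
    platform.example.com/team: payments   # hypothetical label key
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-payments
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: team-payments
spec:
  limits:
    - type: Container
      defaultRequest:   # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:          # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-payments
spec:
  podSelector: {}         # selects every pod in the namespace
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector: {} # only pods in this same namespace may connect
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-edit
  namespace: team-payments
subjects:
  - kind: Group
    name: payments-devs   # hypothetical group from the identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
```

Note that the NetworkPolicy as written also blocks traffic from ingress controllers and monitoring agents in other namespaces, so a real platform would usually add explicit allowances for those.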

Observability Stack

  • Metrics: Prometheus + Thanos/Cortex for long-term storage
  • Logs: OpenTelemetry Collector → Loki or CloudWatch/Azure Monitor
  • Traces: OpenTelemetry SDK → Tempo or Jaeger
  • Dashboards: Grafana with provisioned dashboards-as-code
  • SLO monitoring: Pyrra or Sloth for SLI/SLO management
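
The log and trace paths can be sketched as an OpenTelemetry Collector configuration. The service hostnames are placeholders, TLS is disabled only to keep the example short, and the Loki exporter assumes a Loki version that exposes a native OTLP ingestion endpoint.

```yaml
# Sketch: receive OTLP, batch, then route traces to Tempo and logs to Loki.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch: {}
exporters:
  otlp/tempo:
    endpoint: tempo.observability.svc:4317   # placeholder hostname
    tls:
      insecure: true                         # plaintext, for the sketch only
  otlphttp/loki:
    endpoint: http://loki.observability.svc:3100/otlp   # placeholder hostname
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
```

Routing both signals through one collector lets application teams point their SDKs at a single endpoint while the platform team changes backends behind it.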

Developer Experience

  • Backstage for service catalog and developer portal
  • Scaffolder templates for new service creation
  • TechDocs for documentation-as-code
  • Score or similar for environment-agnostic workload specs
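
For the service catalog, each repository typically carries a descriptor file that Backstage ingests. A minimal sketch of such a `catalog-info.yaml` follows; the service name, owner, and description are placeholders, and the TechDocs annotation assumes the documentation sources sit in the same repository.

```yaml
# Sketch: Backstage catalog entry for one service.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: my-service
  description: Example payments service   # placeholder description
  annotations:
    # Tells TechDocs to build docs from this repository's root.
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  lifecycle: production
  owner: payments   # placeholder owning team
```

A Scaffolder template can generate this file together with the service skeleton, so new services appear in the catalog from their first commit.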