ABOUTME: AWS/GCP cloud infrastructure patterns and best practices

ABOUTME: Well-Architected, security, cost optimization, observability

Cloud Infrastructure (AWS & GCP)

Quick Reference

bash

# AWS
aws sts get-caller-identity
aws --profile staging ecs list-services

# GCP
gcloud auth list
gcloud config set project PROJECT_ID

# Security scanning
trivy config . && checkov -d .

See: terraform/SKILL.md | _PATTERNS.md

AWS Well-Architected (6 Pillars)

Pillar	Key Practices
Operational Excellence	IaC, runbooks, observability, chaos engineering
Security	Least-privilege IAM, GuardDuty/Security Hub, KMS encryption, SCPs
Reliability	Multi-AZ, auto-scaling, backups tested against RTO/RPO targets
Performance Efficiency	Right-sizing, caching, serverless, read replicas
Cost Optimization	Savings Plans/Reserved Instances, Spot, cost-allocation tags
Sustainability	High utilization, Graviton

AWS ECS vs EKS

Factor	ECS	EKS
Complexity	Lower (AWS-native API)	Higher (Kubernetes)
Multi-cloud	No (AWS-specific API)	Yes (standard Kubernetes API)
Control-plane cost	None	$0.10/hr per cluster (standard support)
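The control-plane row translates into a fixed monthly charge per EKS cluster before any worker nodes are counted. A minimal sketch of that arithmetic, assuming the $0.10/hr rate from the table and a 730-hour month (a common estimating convention); the function name is illustrative only:

```python
HOURS_PER_MONTH = 730  # assumption: average month length used for estimates


def eks_control_plane_monthly(clusters: int, hourly_rate: float = 0.10) -> float:
    """Estimated monthly EKS control-plane cost in USD, excluding worker nodes."""
    return round(clusters * hourly_rate * HOURS_PER_MONTH, 2)


if __name__ == "__main__":
    for n in (1, 3):
        print(f"{n} cluster(s): ${eks_control_plane_monthly(n):.2f}/month")
```

One cluster per environment (dev, staging, prod) therefore adds a fixed cost that ECS does not have, which matters most for small workloads.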

ECS Task Definition

hcl

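resource "aws_ecs_task_definition" "app" {
  family                   = "app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc" # the only network mode Fargate supports
  cpu                      = 256      # 0.25 vCPU
  memory                   = 512      # MiB
  # Assumption: var.execution_role_arn is a role ECS can use to pull the image from ECR.
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([{
    name      = "app"
    image     = "${var.ecr_repo}:${var.image_tag}"
    essential = true
    # curl must be installed in the image; the app is assumed to serve /health on port 80.
    healthCheck = {
      command = ["CMD-SHELL", "curl -f http://localhost/health || exit 1"]
    }
  }])
}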
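Fargate accepts only certain cpu/memory pairs, so a mismatched pair in the task definition fails at registration time. The sketch below checks a pair before it reaches Terraform. The table covers the common sizes up to 4 vCPU as I understand them at the time of writing (larger sizes exist); treat it as an assumption and confirm against the current AWS documentation.

```python
# Memory values (MiB) allowed for each Fargate cpu setting (CPU units).
# Assumption: sizes up to 4 vCPU only; verify against current AWS docs.
FARGATE_MEMORY_MIB = {
    256: {512, 1024, 2048},
    512: set(range(1024, 4096 + 1, 1024)),
    1024: set(range(2048, 8192 + 1, 1024)),
    2048: set(range(4096, 16384 + 1, 1024)),
    4096: set(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu: int, memory_mib: int) -> bool:
    """Return True if the cpu/memory pair appears in the table above."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu, set())


if __name__ == "__main__":
    print(is_valid_fargate_size(256, 512))   # the pair used in the task definition
    print(is_valid_fargate_size(256, 4096))  # too much memory for 0.25 vCPU
```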

GCP Cloud Run

yaml

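apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: app # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"   # keep one warm instance to avoid cold starts
        autoscaling.knative.dev/maxScale: "100" # upper bound on instances
    spec:
      containerConcurrency: 80 # max concurrent requests per instance
      containers:
        - image: gcr.io/project/image # placeholder image path
          resources: { limits: { cpu: "1", memory: "512Mi" } }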
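The scaling settings above bound capacity: at most maxScale × containerConcurrency requests can be in flight at once. A rough sizing sketch using Little's law (in-flight requests = arrival rate × latency) follows. It is an estimate only; the real autoscaler also reacts to CPU utilization and keeps headroom, and the function names are illustrative.

```python
import math


def max_concurrent_requests(concurrency: int = 80, max_scale: int = 100) -> int:
    """Hard ceiling on in-flight requests for the service."""
    return concurrency * max_scale


def instances_needed(rps: int, avg_latency_ms: int, concurrency: int = 80,
                     min_scale: int = 1, max_scale: int = 100) -> int:
    """Estimate instances from Little's law, clamped to the scaling bounds."""
    in_flight = rps * avg_latency_ms / 1000  # average concurrent requests
    wanted = math.ceil(in_flight / concurrency)
    return max(min_scale, min(max_scale, wanted))


if __name__ == "__main__":
    print(max_concurrent_requests())    # ceiling for the config above
    print(instances_needed(2000, 200))  # 2000 req/s at 200 ms average latency
```

If the estimate hits maxScale, requests beyond the ceiling queue or are rejected, so either raise maxScale or reduce latency.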

Cost Optimization

Tool	Use
Cost Explorer / Compute Optimizer	AWS analysis
Infracost	IaC cost in PRs
GCP FinOps Hub	GCP cost recommendations (Gemini-assisted)

Compute: Reserved Instances or Savings Plans (up to 72% off on-demand), Spot (up to 90% off), Graviton, auto-scaling

Storage: S3 Intelligent-Tiering, lifecycle policies

Networking: VPC endpoints (avoid NAT gateway data-processing charges)
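To make the compute figures concrete, the sketch below applies the 72% and 90% discounts to a monthly on-demand bill. Both percentages are best-case values; actual rates depend on instance family, term, payment option, and Spot market conditions. The $1,000 baseline and the names are hypothetical.

```python
# Best-case discounts relative to on-demand pricing (upper bounds, not guarantees).
BEST_CASE_DISCOUNTS = {"on-demand": 0.0, "reserved": 0.72, "spot": 0.90}


def monthly_cost(on_demand_usd: float, discount: float) -> float:
    """Monthly cost in USD after applying a fractional discount."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(on_demand_usd * (1.0 - discount), 2)


if __name__ == "__main__":
    baseline = 1000.0  # hypothetical on-demand spend per month
    for option, discount in BEST_CASE_DISCOUNTS.items():
        print(f"{option:>9}: ${monthly_cost(baseline, discount):.2f}")
```

Spot capacity can be reclaimed at short notice, so the largest discount applies only to interruption-tolerant workloads.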

Observability (OpenTelemetry)

Why OTEL: Vendor-neutral instrumentation with one pipeline model for traces, metrics, and logs

yaml

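receivers:
  otlp: { protocols: { grpc: { endpoint: "0.0.0.0:4317" } } } # default OTLP/gRPC port
processors:
  batch: { timeout: 1s } # flush a batch at least once per second
exporters:
  awsxray: { region: us-east-1 } # available in the contrib and ADOT collector distributions
service:
  pipelines:
    traces: { receivers: [otlp], processors: [batch], exporters: [awsxray] }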
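A pipeline may only reference components declared in the top-level receivers, processors, and exporters sections; the collector rejects a config that names an undeclared one. The sketch below mirrors that rule on a plain dictionary copy of the config above. It is a simplified illustration, not the collector's own validation, and the function name is hypothetical.

```python
def undefined_components(config: dict) -> list[str]:
    """List pipeline references that have no matching top-level declaration."""
    missing = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for name, pipeline in pipelines.items():
        for kind in ("receivers", "processors", "exporters"):
            declared = config.get(kind, {})
            for component in pipeline.get(kind, []):
                if component not in declared:
                    missing.append(f"{name}: {kind[:-1]} '{component}' is not defined")
    return missing


# Dictionary form of the collector config shown above.
CONFIG = {
    "receivers": {"otlp": {"protocols": {"grpc": {"endpoint": "0.0.0.0:4317"}}}},
    "processors": {"batch": {"timeout": "1s"}},
    "exporters": {"awsxray": {"region": "us-east-1"}},
    "service": {"pipelines": {"traces": {
        "receivers": ["otlp"], "processors": ["batch"], "exporters": ["awsxray"]}}},
}

if __name__ == "__main__":
    print(undefined_components(CONFIG))
```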

Networking

AWS VPC Design

code

VPC (10.0.0.0/16)
├── Public → ALB, NAT
├── Private → ECS/EKS, Lambda
└── Isolated → RDS (no internet)
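One way to turn the three tiers into concrete subnets is to carve the /16 into equal blocks and hand one block to each tier in each Availability Zone. The sketch below uses Python's standard ipaddress module. The /20 block size, the three AZ suffixes, and the function name are assumptions chosen for illustration.

```python
import ipaddress


def plan_subnets(vpc_cidr: str = "10.0.0.0/16",
                 tiers: tuple = ("public", "private", "isolated"),
                 azs: tuple = ("a", "b", "c"),
                 new_prefix: int = 20) -> dict:
    """Assign one equally sized block per tier per AZ, in order."""
    blocks = list(ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix))
    if len(tiers) * len(azs) > len(blocks):
        raise ValueError("not enough address space for the requested layout")
    plan = {}
    index = 0
    for tier in tiers:
        plan[tier] = {}
        for az in azs:
            plan[tier][az] = str(blocks[index])
            index += 1
    return plan


if __name__ == "__main__":
    for tier, subnets in plan_subnets().items():
        print(tier, subnets)
```

With /20 blocks, nine of the sixteen available blocks are used, leaving room for additional tiers or AZs later.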

VPC Endpoints

hcl

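resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  # Assumption: aws_route_table.private is the route table used by the private subnets.
  # A gateway endpoint carries no traffic until it is associated with a route table.
  route_table_ids   = [aws_route_table.private.id]
}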

Code Review Checklist

Category	Checks
Security	No hardcoded secrets, least privilege IAM, KMS encryption, logging enabled
Cost	Tagged resources, right-sized, auto-scaling
Reliability	Multi-AZ, health checks, backups
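Some checklist items can be automated in review. As one example, the sketch below scans the JSON form of a Terraform plan (the output of terraform show -json) for resources missing required tags. The required tag names and the sample plan are hypothetical, and only resources whose planned state has a tags attribute are checked.

```python
REQUIRED_TAGS = {"Owner", "Environment"}  # hypothetical tagging policy


def missing_tags(plan: dict, required=REQUIRED_TAGS) -> dict:
    """Map resource address to the sorted list of required tags it lacks."""
    findings = {}
    for rc in plan.get("resource_changes", []):
        after = (rc.get("change") or {}).get("after") or {}
        if "tags" not in after:
            continue  # no tags attribute, or the resource is being destroyed
        present = set(after.get("tags") or {})
        absent = sorted(set(required) - present)
        if absent:
            findings[rc["address"]] = absent
    return findings


# Hypothetical, heavily trimmed plan used only to exercise the function.
SAMPLE_PLAN = {"resource_changes": [
    {"address": "aws_s3_bucket.logs",
     "change": {"after": {"tags": {"Owner": "platform"}}}},
    {"address": "aws_instance.web",
     "change": {"after": {"tags": {"Owner": "web", "Environment": "prod"}}}},
    {"address": "aws_iam_policy.read", "change": {"after": {"name": "read"}}},
]}

if __name__ == "__main__":
    print(missing_tags(SAMPLE_PLAN))
```

Tags applied through the provider's default_tags block appear under tags_all rather than tags, so a real check may need to read that attribute instead.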

Resources

AWS: Well-Architected | Security Best Practices

GCP: Architecture Framework

Tools: Checkov | Infracost | OpenTelemetry