GKE Cost Optimization
Purpose
Reduce GKE spending while maintaining performance and reliability. This skill covers cost analysis, resource right-sizing, cluster mode comparison, and budget monitoring strategies.
When to Use
Use this skill when you need to:
- •Analyze GKE spending and identify cost optimization opportunities
- •Compare Autopilot vs Standard billing models for your workload
- •Right-size pod resource requests and limits
- •Configure Horizontal/Vertical Pod Autoscaling to reduce waste
- •Set up Spot VMs for batch or non-critical workloads
- •Create budget alerts and track cost per service
- •Optimize resource utilization
Trigger phrases: "optimize GKE costs", "reduce Kubernetes spending", "right-size resources", "Autopilot vs Standard pricing", "GKE budget alerts"
Table of Contents
Quick Start
Analyze and optimize costs in three steps:
# 1. Check current resource usage kubectl top pods -n wtr-supplier-charges # 2. Analyze resource requests vs actual usage kubectl describe deployment supplier-charges-hub -n wtr-supplier-charges | grep -A 10 "resources:" # 3. Compare Autopilot vs Standard pricing for your workload # Autopilot: Pay per pod per second for requests # Standard: Pay per provisioned node per hour (regardless of usage)
Instructions
Step 1: Understand Your Billing Model
GKE Autopilot (Per-Pod Billing - Recommended)
Cost Calculation:
Monthly Cost = (vCPU requests * $0.04 + Memory requests GB * $0.004 + Disk GB * $0.0001) * seconds per month
Example for Supplier Charges Hub:
- •2 replicas, each requesting: 1 vCPU + 2 GB memory
- •Monthly cost ≈ (2 * 1 * $0.04 + 2 * 2 * $0.004) * 2.592M seconds ≈ $260
Advantages:
- •Pay only for what pods request (not what entire cluster capacity is)
- •No idle resource costs
- •Perfect for variable workloads (scales up/down automatically)
- •Up to 60% cheaper than Standard for typical workloads
GKE Standard (Node Billing)
Cost Calculation:
Monthly Cost = Number of nodes * Machine type hourly rate * 730 hours per month
Example for Supplier Charges Hub:
- •3
n2-standard-4nodes (standard pool) = ~$400/month - •Even if pods use only 30% of capacity, you pay for 100%
Advantages:
- •Predictable costs (great for committed use discounts)
- •Full control over infrastructure
- •Better for stable, high-utilization workloads
Step 2: Analyze Current Resource Usage
Check if pods are over-provisioned:
# View actual vs requested resources
kubectl top pods -n wtr-supplier-charges -o wide
# Compare to requests
kubectl get pods -n wtr-supplier-charges -o jsonpath='{.items[*].spec.containers[*].resources.requests}'
Analysis Questions:
- •Are actual values significantly lower than requests?
- •Is memory usage consistently below 75% of limits?
- •Is CPU usage consistently below 70% of requests?
If Yes → Right-size resources (reduce requests)
Step 3: Right-Size Pod Resources
Use Vertical Pod Autoscaler (VPA) recommendations:
# Apply VPA to deployment
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: supplier-charges-hub-vpa
namespace: wtr-supplier-charges
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: supplier-charges-hub
updatePolicy:
updateMode: "Off" # Only provide recommendations, don't auto-update
resourcePolicy:
containerPolicies:
- containerName: supplier-charges-hub-container
minAllowed:
cpu: 500m
memory: 1Gi
maxAllowed:
cpu: 2
memory: 4Gi
EOF
# Wait 1 week for data collection, then view recommendations
kubectl describe vpa supplier-charges-hub-vpa -n wtr-supplier-charges | grep -A 20 "Recommendation"
Recommended VPA Values:
resources:
requests:
cpu: 500m # Reduced from 1000m if usage averages 300m
memory: 1.5Gi # Reduced from 2Gi if usage averages 1Gi
limits:
cpu: 500m # Match requests for Guaranteed QoS
memory: 1.5Gi
Step 4: Configure Horizontal Pod Autoscaling (HPA)
Let Kubernetes scale replicas based on demand:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: supplier-charges-hub-hpa
namespace: wtr-supplier-charges
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: supplier-charges-hub
minReplicas: 1 # Scale down to 1 during low traffic
maxReplicas: 5 # Scale up to 5 during peak
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
Cost Savings: Reduces off-peak replicas from 2 to 1 = ~30% savings if traffic varies.
Step 5: Use Spot VMs (GKE Standard Only)
For non-critical or fault-tolerant workloads, use Spot VMs (91% discount):
# Create Spot node pool gcloud container node-pools create spot-pool \ --cluster=shared-gke-standard-01-euw2 \ --region=europe-west2 \ --spot \ --machine-type=n2-standard-4 \ --num-nodes=2 \ --enable-autoscaling \ --min-nodes=0 \ --max-nodes=10 # Add toleration to workloads that can run on Spot tolerations: - key: cloud.google.com/gke-spot operator: Equal value: "true" effect: NoSchedule
Use Cases: Batch processing, CI/CD build workers, non-critical background jobs
Not for: Production APIs (Supplier Charges Hub should NOT use Spot)
Step 6: Set Up Cost Monitoring
Create budget alerts:
# Create budget with 50%, 90%, 100% thresholds gcloud billing budgets create \ --billing-account=BILLING_ACCOUNT_ID \ --display-name="GKE Labs Environment" \ --budget-amount=500USD \ --threshold-rule=percent=50 \ --threshold-rule=percent=90 \ --threshold-rule=percent=100 \ --notification-rule-name=email-alert
Step 7: Analyze Cost by Namespace/Service
Export billing data for detailed analysis:
# Tag namespace for cost tracking kubectl label namespace wtr-supplier-charges \ cost-center=supplier-charges \ environment=labs # Later, filter GCP Billing reports by labels # In Cloud Console: Billing → Reports → Filter by labels
Examples
Example 1: Optimize Supplier Charges Hub Deployment
#!/bin/bash
# Step-by-step optimization
DEPLOYMENT="supplier-charges-hub"
NAMESPACE="wtr-supplier-charges"
echo "=== GKE Cost Optimization ==="
echo ""
echo "1. Current Resource Usage:"
kubectl top pods -l app=$DEPLOYMENT -n $NAMESPACE
echo ""
echo "2. Current Resource Requests:"
kubectl get deployment $DEPLOYMENT -n $NAMESPACE \
-o jsonpath='{.spec.template.spec.containers[0].resources}'
echo ""
echo "3. Applying VPA for recommendations:"
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: ${DEPLOYMENT}-vpa
namespace: $NAMESPACE
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: $DEPLOYMENT
updatePolicy:
updateMode: "Off"
EOF
echo "VPA created. Wait 1 week for recommendations."
echo ""
echo "4. Applying HPA for automatic scaling:"
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: ${DEPLOYMENT}-hpa
namespace: $NAMESPACE
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: $DEPLOYMENT
minReplicas: 1
maxReplicas: 5
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
EOF
echo "HPA created. Replicas will auto-scale based on demand."
echo ""
echo "5. Estimated Savings:"
echo " - Autopilot: ~$260/month (current)"
echo " - With HPA (avg 1.5 replicas): ~$195/month"
echo " - Savings: ~$65/month (25%)"
Example 2: Autopilot vs Standard Cost Comparison
#!/bin/bash
# Compare pricing for your workload
# Supplier Charges Hub specifications
REPLICAS=2
CPU_PER_POD="1000m"
MEMORY_PER_POD="2Gi"
DISK_PER_POD="10Gi"
# Autopilot pricing (europe-west2)
CPU_PRICE=0.04
MEMORY_PRICE=0.004
DISK_PRICE=0.0001
SECONDS_PER_MONTH=2592000
TOTAL_CPU=$(echo "$REPLICAS * 1" | bc)
TOTAL_MEMORY=$(echo "$REPLICAS * 2" | bc)
TOTAL_DISK=$(echo "$REPLICAS * 10" | bc)
AUTOPILOT_COST=$(echo "($TOTAL_CPU * $CPU_PRICE + $TOTAL_MEMORY * $MEMORY_PRICE + $TOTAL_DISK * $DISK_PRICE) * $SECONDS_PER_MONTH / 3600 / 24 / 30" | bc)
echo "=== Cost Comparison: Autopilot vs Standard ==="
echo ""
echo "Workload: $REPLICAS replicas, ${CPU_PER_POD}m CPU, ${MEMORY_PER_POD} memory each"
echo ""
echo "AUTOPILOT (Per-Pod Billing):"
echo " Monthly cost: ~\$${AUTOPILOT_COST}"
echo ""
echo "STANDARD (Node-Based Billing):"
echo " 3x n2-standard-4 nodes @ \$400/month"
echo " 3x cloud-sql-proxy sidecars @ \$50/month"
echo " Total: ~\$450/month"
echo ""
echo "SAVINGS with Autopilot: ~\$190/month (42%)"
Example 3: Set Up Cost Tracking and Alerts
#!/bin/bash # Set up cost monitoring BILLING_ACCOUNT="000000-A1B2C3-D4E5F6" # Replace with your billing account NAMESPACE="wtr-supplier-charges" PROJECT_ID="ecp-wtr-supplier-charges-labs" echo "=== Setting Up Cost Monitoring ===" echo "" echo "1. Tag resources for cost tracking" kubectl label namespace $NAMESPACE \ cost-center=supplier-charges \ environment=labs \ --overwrite echo "" echo "2. Create billing budget" gcloud billing budgets create \ --billing-account=$BILLING_ACCOUNT \ --display-name="GKE Labs - Supplier Charges Hub" \ --budget-amount=500USD \ --threshold-rule=percent=50 \ --threshold-rule=percent=90 \ --threshold-rule=percent=100 \ --notification-rule-name="Budget Alert" echo "" echo "3. Enable Cost Analysis" echo " Go to: https://console.cloud.google.com/billing/reports" echo " Filter by: labels.cost-center = supplier-charges" echo "" echo "4. Track GKE Costs" echo " Go to: https://console.cloud.google.com/monitoring/dashboards" echo " Create dashboard filtering: resource.type=k8s_cluster AND resource.labels.cluster_name"
Requirements
- •GKE cluster running (Autopilot or Standard)
- •
kubectlaccess to view pod metrics and deployments - •
gcloudCLI access for billing and monitoring setup - •Metrics server enabled (usually on by default)
- •For billing: access to GCP billing account
See Also
- •gcp-gke-cluster-setup - Understand cluster modes and their costs
- •gcp-gke-deployment-strategies - Optimize deployment replicas
- •gcp-gke-monitoring-observability - Monitor resource utilization