Cloud Infrastructure
Comprehensive cloud infrastructure skill covering multi-cloud architecture, Infrastructure as Code, cost optimization, and production deployment patterns.
When to Use This Skill
- •Designing cloud architecture for new applications
- •Implementing Infrastructure as Code (Terraform, CloudFormation, Pulumi)
- •Cost optimization and resource right-sizing
- •Multi-region and high-availability deployments
- •Cloud migration planning
- •Security and compliance implementation
- •Auto-scaling and performance optimization
Cloud Architecture Patterns
Compute Patterns
| Pattern | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Serverless | Lambda | Functions | Cloud Functions | Event-driven, variable load |
| Containers | ECS/EKS | AKS | GKE | Microservices, consistent env |
| VMs | EC2 | Virtual Machines | Compute Engine | Legacy apps, full control |
| Batch | Batch | Batch | Batch | Large-scale processing |
Storage Patterns
| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Object | S3 | Blob Storage | Cloud Storage | Static files, backups |
| Block | EBS | Managed Disks | Persistent Disk | Database storage |
| File | EFS | Azure Files | Filestore | Shared file systems |
| Archive | Glacier | Archive | Coldline | Long-term retention |
Database Patterns
| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Relational | RDS, Aurora | SQL Database | Cloud SQL | ACID transactions |
| NoSQL | DynamoDB | Cosmos DB | Firestore | Flexible schema |
| Cache | ElastiCache | Cache for Redis | Memorystore | Session, caching |
| Data Warehouse | Redshift | Synapse | BigQuery | Analytics |
Infrastructure as Code
Terraform Best Practices
Project Structure:
code
infrastructure/ ├── modules/ │ ├── networking/ │ ├── compute/ │ └── database/ ├── environments/ │ ├── dev/ │ ├── staging/ │ └── prod/ ├── main.tf ├── variables.tf ├── outputs.tf └── versions.tf
State Management:
- •Use remote state (S3, Azure Blob, GCS)
- •Enable state locking (DynamoDB, Blob lease)
- •Separate state per environment
- •Never commit state files
Module Design:
- •Single responsibility per module
- •Expose minimal required variables
- •Document inputs/outputs
- •Version modules with git tags
Cost Optimization
Compute Savings:
- •Reserved Instances (1-3 year commitment): 30-60% savings
- •Spot/Preemptible instances: 60-90% savings for interruptible workloads
- •Right-sizing: Match instance size to actual usage
- •Auto-scaling: Scale down during low usage
Storage Savings:
- •Lifecycle policies: Auto-transition to cheaper tiers
- •Compression: Reduce storage footprint
- •Deduplication: Eliminate redundant data
- •Delete unused resources: Orphaned volumes, snapshots
Network Savings:
- •Use CDN for static content
- •Optimize data transfer paths
- •Use private endpoints
- •Compress API responses
High Availability Patterns
Multi-AZ Deployment
- •Deploy across 2-3 availability zones
- •Use load balancers for distribution
- •Database replication across AZs
- •Automatic failover configuration
Multi-Region Deployment
- •Active-active or active-passive
- •DNS-based routing (Route53, Traffic Manager)
- •Data replication strategy
- •Disaster recovery procedures
Resilience Patterns
- •Circuit breakers for external dependencies
- •Retry with exponential backoff
- •Bulkhead isolation
- •Graceful degradation
Security Best Practices
Identity & Access
- •Principle of least privilege
- •Use IAM roles, not long-term credentials
- •Enable MFA for privileged accounts
- •Regular access reviews
Network Security
- •VPC/VNet isolation
- •Security groups as firewalls
- •Private subnets for backend services
- •VPN/Direct Connect for hybrid
Data Protection
- •Encryption at rest (KMS)
- •Encryption in transit (TLS)
- •Key rotation policies
- •Backup and recovery testing
Monitoring & Observability
Key Metrics
- •CPU, Memory, Disk utilization
- •Network throughput and latency
- •Error rates and types
- •Cost per service/team
Alerting Strategy
- •Set thresholds based on baselines
- •Alert on symptoms, not causes
- •Runbooks for each alert
- •Escalation paths defined
Reference Files
- •
references/terraform_patterns.md- IaC patterns and examples - •
references/cost_optimization.md- Detailed cost reduction strategies
Integration with Other Skills
- •security-engineering - For security architecture
- •network-engineering - For network design
- •performance - For optimization strategies
- •devops-runbooks - For operational procedures