AWS Limits Review
Goal
Catch AWS service limit/quota violations in infrastructure code before they cause production issues.
When to use
- •Reviewing Terraform, CloudFormation, CDK, or Pulumi code
- •Code that provisions or configures AWS resources
- •PR reviews involving AWS infrastructure changes
- •Before deploying infrastructure changes to production
When not to use
- •Non-AWS cloud providers (future skills: cloudflare-limits, gcp-limits)
- •Application code without infrastructure components
- •AWS SDK usage for read-only operations
Inputs
- •Infrastructure code files (.tf, .json, .yaml, .yml, .ts, .py)
- •Context about expected scale/traffic patterns (if available)
Outputs
- •List of potential limit violations with severity
- •Links to official AWS documentation
- •Recommended mitigations
Default workflow
- •Identify AWS services in the code:
  - •Scan for resource types (e.g., aws_lambda_function, AWS::Lambda::Function)
  - •Note service patterns in SDK calls
- •Check against known limits in REFERENCE.md:
  - •Hard limits (cannot be increased)
  - •Soft limits at default values
  - •Limits that commonly cause production issues
- •Flag violations with context:
  - •Quote the specific code
  - •Explain the limit and why it matters
  - •Link to AWS documentation
  - •Suggest mitigation if applicable
- •Prioritize by severity:
  - •Critical: Hard limits that will fail immediately
  - •High: Default limits likely to be hit at moderate scale
  - •Medium: Limits that may cause issues at high scale
  - •Low: Informational, worth noting for awareness
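The first workflow step can be sketched roughly as follows. This is a minimal illustration, not part of the skill itself: the function name `find_aws_services` and both regular expressions are assumptions, and a plain text scan like this will miss resources declared through CDK or Pulumi constructs.

```python
import re

# Terraform resource blocks look like: resource "aws_lambda_function" "name" {
TF_RESOURCE = re.compile(r'resource\s+"(aws_[a-z0-9_]+)"')
# CloudFormation resource types look like: AWS::Lambda::Function
CFN_TYPE = re.compile(r"\bAWS::([A-Za-z0-9]+)::[A-Za-z0-9]+")


def find_aws_services(source: str) -> set:
    """Return lowercase service hints found in Terraform or CloudFormation text.

    Hypothetical helper: the hint is the token after "aws_" for Terraform
    (e.g. "lambda", "lb") or the middle segment of a CloudFormation type.
    """
    services = set()
    for resource_type in TF_RESOURCE.findall(source):
        # "aws_lambda_function" -> "lambda"
        services.add(resource_type.split("_")[1])
    for service in CFN_TYPE.findall(source):
        # "Lambda" -> "lambda"
        services.add(service.lower())
    return services
```

The resulting set is only a starting point for the REFERENCE.md lookup; each hit still needs a manual check against the relevant limits.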
Key patterns to watch
Lambda
- •Functions with >15 min expected runtime (hard limit: 15 min)
- •Synchronous payloads >6MB
- •VPC-connected functions without sufficient ENI capacity
- •High concurrency without reserved/provisioned concurrency
API Gateway
- •Integrations expecting >29s response time (default timeout)
- •Missing throttling configuration for public APIs
- •WebSocket connections without connection management
S3
- •Single-prefix designs with high throughput expectations
- •Missing retry logic for 503 responses
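For the missing-retry pattern, the kind of handling a reviewer would look for is roughly the following. It is a sketch under stated assumptions: `SlowDownError` is a hypothetical stand-in for whatever exception the client raises on a 503 response, and `call_with_backoff` is an illustrative name. The AWS SDKs ship their own retry handling, so in practice configuring that is often enough.

```python
import random
import time


class SlowDownError(Exception):
    """Hypothetical stand-in for an S3 503 response surfaced by a client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on SlowDownError with exponential backoff.

    Uses full jitter: each wait is a random duration between 0 and
    base_delay * 2**attempt seconds. The sleep function is injectable
    so the behavior can be exercised without real delays.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```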
DynamoDB
- •Items approaching 400KB
- •Single-table designs without GSI capacity planning
- •On-demand tables without understanding burst limits
Step Functions
- •Long-running workflows approaching 25,000 event limit
- •Map states without concurrency limits
- •Large payloads (>256KB state data)
Load Balancing
- •Multiple load balancers sharing target groups (limit: 5 LBs per target group)
- •Target groups with many targets without health check tuning
- •Weighted routing with many target groups per action
Cognito
- •Auth flows without rate limit handling (default: 120 RPS)
- •Bulk user imports without throttling
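The numeric limits quoted in the patterns above can be kept in a small lookup table so a review compares planned values consistently. The sketch below is illustrative only: the keys, the `Limit` type, and the `headroom` helper are assumptions, and the values are copied from this section rather than from AWS, so they should be confirmed against REFERENCE.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    service: str
    name: str
    value: float
    unit: str


# Values copied from the patterns listed in this section; confirm in REFERENCE.md.
LIMITS = {
    "lambda_timeout": Limit("Lambda", "Function timeout", 900, "seconds"),
    "lambda_sync_payload": Limit("Lambda", "Synchronous payload", 6, "MB"),
    "apigw_integration_timeout": Limit("API Gateway", "Integration timeout", 29, "seconds"),
    "dynamodb_item_size": Limit("DynamoDB", "Item size", 400, "KB"),
    "sfn_history_events": Limit("Step Functions", "Execution history events", 25_000, "events"),
    "sfn_state_payload": Limit("Step Functions", "State payload", 256, "KB"),
    "lbs_per_target_group": Limit("Load Balancing", "Load balancers per target group", 5, "load balancers"),
    "cognito_auth_rps": Limit("Cognito", "Auth requests per second", 120, "RPS"),
}


def headroom(key: str, planned: float) -> float:
    """Return the unused fraction of a limit; a negative result means over the limit."""
    limit = LIMITS[key]
    return (limit.value - planned) / limit.value
```

For instance, `headroom("dynamodb_item_size", 200)` reports half the item-size limit still unused, while a negative result points to a finding worth flagging.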
Validation checklist
- • All identified AWS services checked against REFERENCE.md
- • Hard limits flagged as Critical
- • Soft limits noted with current defaults
- • Links to AWS docs included for each finding
- • Mitigations suggested where applicable
Output format
markdown
## AWS Limits Review

### Critical
- **[Service]: [Limit name]**
  - File: `path/to/file.tf:42`
  - Issue: [Description]
  - Limit: [Value] ([hard/soft])
  - Docs: [AWS link]
  - Mitigation: [Suggestion]

### High
...

### Medium
...

### Low / Informational
...

### No Issues Found
- [List of services reviewed with no concerns]
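One way to produce the template above from structured findings is sketched below. The dictionary keys (`severity`, `service`, `limit_name`, `file`, `issue`, `limit`, `docs`, `mitigation`) and the name `render_report` are assumptions made for illustration. The sketch groups findings from Critical down to Low and leaves out the "No Issues Found" section.

```python
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def render_report(findings):
    """Render finding dicts as markdown, grouped by severity from Critical to Low."""
    lines = ["## AWS Limits Review"]
    for severity in SEVERITY_ORDER:
        group = [item for item in findings if item["severity"] == severity]
        if not group:
            continue  # skip headings for severities with no findings
        heading = "Low / Informational" if severity == "Low" else severity
        lines += ["", f"### {heading}"]
        for item in group:
            lines += [
                f"- **{item['service']}: {item['limit_name']}**",
                f"  - File: `{item['file']}`",
                f"  - Issue: {item['issue']}",
                f"  - Limit: {item['limit']}",
                f"  - Docs: {item['docs']}",
                f"  - Mitigation: {item['mitigation']}",
            ]
    return "\n".join(lines)
```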
Examples
Example 1: Lambda timeout
Input: Terraform defining Lambda with Step Functions integration
hcl
resource "aws_lambda_function" "processor" {
  # Other required arguments (function_name, role, ...) omitted for brevity
  timeout = 900 # 15 minutes
}
Output:
High: Lambda timeout at maximum
- •File: lambda.tf:12
- •Issue: Lambda timeout set to maximum (900s). Any operation exceeding this will fail.
- •Limit: 900 seconds (hard)
- •Docs: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
- •Mitigation: For longer processes, use Step Functions with Lambda iterations or switch to Fargate.
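A check like the one in this example could be automated along these lines. This is a rough sketch, not a real HCL parser: it assumes each resource block closes with a `}` in the first column and that `timeout` is a literal integer, and the names `lambda_timeouts_at_max` and `LAMBDA_MAX_TIMEOUT_S` are hypothetical.

```python
import re

LAMBDA_MAX_TIMEOUT_S = 900  # hard limit quoted in the example above

# Assumes the resource block ends with a closing brace at the start of a line.
LAMBDA_BLOCK = re.compile(
    r'resource\s+"aws_lambda_function"\s+"(?P<name>[^"]+)"\s*\{(?P<body>.*?)^\}',
    re.DOTALL | re.MULTILINE,
)
# Matches a literal "timeout = <integer>" argument at the start of a line.
TIMEOUT = re.compile(r"^\s*timeout\s*=\s*(\d+)", re.MULTILINE)


def lambda_timeouts_at_max(terraform_source):
    """Return names of Lambda functions whose literal timeout is 900s or more."""
    flagged = []
    for block in LAMBDA_BLOCK.finditer(terraform_source):
        match = TIMEOUT.search(block.group("body"))
        if match and int(match.group(1)) >= LAMBDA_MAX_TIMEOUT_S:
            flagged.append(block.group("name"))
    return flagged
```

Timeouts set through variables or locals are not resolved by this sketch, so those still need a manual look.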
Example 2: Multiple LBs per target group
Input: Terraform with shared target groups
hcl
resource "aws_lb_target_group" "shared" {
  name = "shared-targets"
}

resource "aws_lb_listener" "lb1" {
  # load_balancer_arn, port, and protocol omitted for brevity
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.shared.arn
  }
}
# ... repeated for 6 load balancers
Output:
Critical: Target group shared across too many load balancers
- •File: alb.tf:15-45
- •Issue: Target group "shared-targets" referenced by 6 load balancers. Maximum is 5.
- •Limit: 5 load balancers per target group (hard)
- •Docs: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-limits.html
- •Mitigation: Create separate target groups for additional load balancers.
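The counting behind this finding can be sketched as below, assuming the listener-to-target-group mapping has already been extracted from the Terraform code as (load balancer, target group) pairs. The function name and the input shape are hypothetical, and the limit of 5 is the value quoted in this example, which should be confirmed against REFERENCE.md.

```python
from collections import defaultdict

MAX_LBS_PER_TARGET_GROUP = 5  # limit quoted in the example above


def overshared_target_groups(listeners):
    """Return {target_group: load_balancer_count} for groups over the limit.

    `listeners` is an iterable of (load_balancer, target_group) pairs.
    Several listeners on the same load balancer count once.
    """
    lbs_by_target_group = defaultdict(set)
    for load_balancer, target_group in listeners:
        lbs_by_target_group[target_group].add(load_balancer)
    return {
        target_group: len(load_balancers)
        for target_group, load_balancers in lbs_by_target_group.items()
        if len(load_balancers) > MAX_LBS_PER_TARGET_GROUP
    }
```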