AWS Limits Review

Goal

Catch AWS service limit/quota violations in infrastructure code before they cause production issues.

When to use

•Reviewing Terraform, CloudFormation, CDK, or Pulumi code
•Code that provisions or configures AWS resources
•PR reviews involving AWS infrastructure changes
•Before deploying infrastructure changes to production

When not to use

•Non-AWS cloud providers (future skills: cloudflare-limits, gcp-limits)
•Application code without infrastructure components
•AWS SDK usage for read-only operations

Inputs

•Infrastructure code files (.tf, .json, .yaml, .yml, .ts, .py)
•Context about expected scale/traffic patterns (if available)

Outputs

•List of potential limit violations with severity
•Links to official AWS documentation
•Recommended mitigations

Default workflow

•Identify AWS services in the code:
  - Scan for resource types (e.g., aws_lambda_function, AWS::Lambda::Function)
  - Note service patterns in SDK calls
•Check against known limits in REFERENCE.md:
  - Hard limits (cannot be increased)
  - Soft limits at default values
  - Limits that commonly cause production issues
•Flag violations with context:
  - Quote the specific code
  - Explain the limit and why it matters
  - Link to AWS documentation
  - Suggest mitigation if applicable
•Prioritize by severity:
  - Critical: Hard limits that will fail immediately
  - High: Default limits likely to be hit at moderate scale
  - Medium: Limits that may cause issues at high scale
  - Low: Informational, worth noting for awareness
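
The first and last steps of this workflow can be sketched in Python. This is a minimal illustration that assumes Terraform input and a regex scan; the names `find_aws_resource_types` and `sort_by_severity` are hypothetical, and a real review would more likely rely on a proper HCL parser.

```python
import re

# Matches Terraform resource headers such as:
#   resource "aws_lambda_function" "processor" {
RESOURCE_RE = re.compile(
    r'^\s*resource\s+"(aws_[a-z0-9_]+)"\s+"([^"]+)"', re.MULTILINE
)

# Severity order taken from the workflow above.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def find_aws_resource_types(terraform_source: str) -> dict[str, list[str]]:
    """Map each AWS resource type to the resource names declared with it."""
    found: dict[str, list[str]] = {}
    for resource_type, name in RESOURCE_RE.findall(terraform_source):
        found.setdefault(resource_type, []).append(name)
    return found


def sort_by_severity(findings: list[dict]) -> list[dict]:
    """Order findings from Critical to Low; ties keep their input order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
```

The resource types returned by the scan are what would then be looked up in REFERENCE.md.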

Key patterns to watch

Lambda

•Functions with >15 min expected runtime (hard limit: 15 min)
•Synchronous payloads >6MB
•VPC-connected functions without sufficient ENI capacity
•High concurrency without reserved/provisioned concurrency
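
The first two Lambda patterns reduce to simple threshold checks. A rough sketch, assuming the reviewer has been given an expected runtime and payload size as context; `review_lambda` is a hypothetical helper, and the 6 MB figure is interpreted here as 6 × 1024 × 1024 bytes.

```python
LAMBDA_MAX_TIMEOUT_S = 900                   # hard limit: 15 minutes
LAMBDA_SYNC_PAYLOAD_BYTES = 6 * 1024 * 1024  # 6 MB, interpreted as MiB (assumption)


def review_lambda(expected_runtime_s: float, sync_payload_bytes: int) -> list[str]:
    """Return one message per Lambda limit the expected workload would exceed."""
    problems = []
    if expected_runtime_s > LAMBDA_MAX_TIMEOUT_S:
        problems.append(
            f"expected runtime {expected_runtime_s:.0f}s exceeds "
            f"the {LAMBDA_MAX_TIMEOUT_S}s timeout limit"
        )
    if sync_payload_bytes > LAMBDA_SYNC_PAYLOAD_BYTES:
        problems.append(
            f"synchronous payload of {sync_payload_bytes} bytes exceeds "
            f"{LAMBDA_SYNC_PAYLOAD_BYTES} bytes"
        )
    return problems
```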

API Gateway

•Integrations expecting >29s response time (default maximum integration timeout)
•Missing throttling configuration for public APIs
•WebSocket connections without connection management

S3

•Single-prefix designs with high throughput expectations (per-prefix baseline: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second)
•Missing retry logic for 503 Slow Down responses
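
For reference, the kind of retry logic a reviewer would look for can be sketched as exponential backoff with jitter. This is only an illustration: `SlowDownError` is a hypothetical stand-in for a 503 response, and `call_with_backoff` is not part of any AWS SDK.

```python
import random
import time


class SlowDownError(Exception):
    """Hypothetical stand-in for an S3 503 Slow Down response."""


def call_with_backoff(operation, max_attempts=5, base_delay_s=0.1, sleep=time.sleep):
    """Call operation(), retrying on SlowDownError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait a random time between 0 and base_delay_s * 2^attempt seconds.
            sleep(random.uniform(0, base_delay_s * 2 ** attempt))
```

The AWS SDKs ship their own retry handling, so in practice the review question is usually whether retries are configured sensibly rather than whether they are hand-rolled.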

DynamoDB

•Items approaching 400KB
•Single-table designs without GSI capacity planning
•On-demand tables without understanding burst limits
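
To judge whether an item is "approaching 400KB", a rough size estimate is often enough. The sketch below assumes flat items holding only strings, bytes, and numbers, treats 400 KB as 400 × 1024 bytes, and counts every number at a conservative 21 bytes; the helper names and the 80% threshold are hypothetical choices, and the result is an estimate rather than DynamoDB's exact accounting.

```python
DYNAMODB_MAX_ITEM_BYTES = 400 * 1024  # 400 KB item size limit (assumed KiB)
NUMBER_UPPER_BOUND_BYTES = 21         # conservative ceiling for one number value


def estimate_item_bytes(item: dict) -> int:
    """Roughly estimate a flat item's size: attribute names plus values, in bytes."""
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        elif isinstance(value, bytes):
            total += len(value)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            total += NUMBER_UPPER_BOUND_BYTES
        else:
            raise TypeError(f"unsupported attribute type for {name!r}")
    return total


def is_approaching_limit(item: dict, threshold: float = 0.8) -> bool:
    """True when the estimate reaches the given fraction of the item size limit."""
    return estimate_item_bytes(item) >= threshold * DYNAMODB_MAX_ITEM_BYTES
```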

Step Functions

•Long-running workflows approaching the 25,000-event execution history limit
•Map states without a MaxConcurrency setting
•Large payloads (>256KB state data)
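
The event limit lends itself to a quick budget calculation. In this sketch the number of history events per loop iteration is an input the reviewer has to estimate from the state machine definition; `max_safe_iterations` is a hypothetical helper.

```python
STEP_FUNCTIONS_MAX_EVENTS = 25_000  # execution history limit quoted above


def max_safe_iterations(events_per_iteration: int, fixed_events: int = 0) -> int:
    """How many loop iterations fit in the history before the event limit is reached."""
    if events_per_iteration <= 0:
        raise ValueError("events_per_iteration must be positive")
    remaining = STEP_FUNCTIONS_MAX_EVENTS - fixed_events
    return max(0, remaining // events_per_iteration)
```

For example, a loop that records 5 events per pass would exhaust the history after 5,000 iterations, so a workflow expected to loop more often than that deserves a finding.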

Load Balancing

•Multiple load balancers sharing target groups (limit: 5 LBs per target group)
•Target groups with many targets without health check tuning
•Weighted routing with many target groups per action
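
The shared-target-group pattern can be approximated by counting references in Terraform source. This is a rough sketch with a hypothetical helper name: it counts `target_group_arn` references rather than distinct load balancers, so several listeners on one load balancer would be over-counted, and the default limit is simply the figure quoted in this document.

```python
import re
from collections import Counter

# Matches references such as: target_group_arn = aws_lb_target_group.shared.arn
TARGET_GROUP_REF_RE = re.compile(
    r"target_group_arn\s*=\s*aws_lb_target_group\.([A-Za-z0-9_-]+)\.arn"
)


def overshared_target_groups(terraform_source: str, limit: int = 5) -> dict[str, int]:
    """Return target groups referenced more often than `limit` (value assumed from this doc)."""
    counts = Counter(TARGET_GROUP_REF_RE.findall(terraform_source))
    return {name: n for name, n in counts.items() if n > limit}
```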

Cognito

•Auth flows without rate limit handling (default: 120 RPS)
•Bulk user imports without throttling
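
Throttling a bulk import can be as simple as pacing the calls. A minimal sketch, assuming the caller picks a rate comfortably below the quota; `paced` is a hypothetical helper, and the `sleep` and `clock` parameters exist only so the pacing can be tested without real delays.

```python
import time


def paced(items, max_per_second, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than max_per_second by sleeping between them."""
    interval = 1.0 / max_per_second
    next_slot = clock()
    for item in items:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)
        # Space starts at least `interval` apart, even if the caller was slow.
        next_slot = max(next_slot, clock()) + interval
        yield item
```

A loop such as `for user in paced(users, max_per_second=60): ...` would then stay at roughly half of the 120 RPS default mentioned above.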

Validation checklist

• All identified AWS services checked against REFERENCE.md
• Hard limits flagged as Critical
• Soft limits noted with current defaults
• Links to AWS docs included for each finding
• Mitigations suggested where applicable

Output format

```markdown
## AWS Limits Review

### Critical
- **[Service]: [Limit name]**
  - File: `path/to/file.tf:42`
  - Issue: [Description]
  - Limit: [Value] ([hard/soft])
  - Docs: [AWS link]
  - Mitigation: [Suggestion]

### High
...

### Medium
...

### Low / Informational
...

### No Issues Found
- [List of services reviewed with no concerns]
```
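
As an illustration of how findings map onto this template, the sketch below renders a list of finding dictionaries into the same layout. The dictionary keys and the `render_report` name are hypothetical, and skipping empty severity sections is a simplification of the template rather than a requirement.

```python
# (severity key, heading text) pairs in the order used by the template.
SECTIONS = [
    ("Critical", "Critical"),
    ("High", "High"),
    ("Medium", "Medium"),
    ("Low", "Low / Informational"),
]


def render_report(findings: list[dict]) -> str:
    """Render findings as markdown, grouped by severity; empty sections are skipped."""
    lines = ["## AWS Limits Review", ""]
    for severity, heading in SECTIONS:
        group = [f for f in findings if f["severity"] == severity]
        if not group:
            continue
        lines.append(f"### {heading}")
        for f in group:
            lines += [
                f"- **{f['service']}: {f['limit_name']}**",
                f"  - File: `{f['file']}`",
                f"  - Issue: {f['issue']}",
                f"  - Limit: {f['limit']}",
                f"  - Docs: {f['docs']}",
                f"  - Mitigation: {f['mitigation']}",
            ]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```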

Examples

Example 1: Lambda timeout

Input: Terraform defining Lambda with Step Functions integration

```hcl
resource "aws_lambda_function" "processor" {
  timeout = 900  # 15 minutes
}
```

Output:

High: Lambda timeout at maximum

•File: lambda.tf:12
•Issue: Lambda timeout set to maximum (900s). Any operation exceeding this will fail.
•Limit: 900 seconds (hard)
•Docs: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
•Mitigation: For longer processes, split the work into shorter Lambda invocations orchestrated by Step Functions, or move it to Fargate.

Example 2: Multiple LBs per target group

Input: Terraform with shared target groups

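```hcl
resource "aws_lb_target_group" "shared" {
  name = "shared-targets"
  # port, protocol, and vpc_id omitted for brevity
}

resource "aws_lb_listener" "lb1" {
  load_balancer_arn = aws_lb.lb1.arn # assumes an aws_lb.lb1 defined elsewhere
  # port and protocol omitted for brevity

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.shared.arn
  }
}
# ... repeated for 6 load balancers
```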

Output:

Critical: Target group shared across too many load balancers

•File: alb.tf:15-45
•Issue: Target group "shared-targets" referenced by 6 load balancers. Maximum is 5.
•Limit: 5 load balancers per target group (hard)
•Docs: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-limits.html
•Mitigation: Create separate target groups for additional load balancers.