Cross-Tool Integration Pattern: AWS Infrastructure + Documentation

Pattern Type: Infrastructure Deployment Tools: aws-api, aws-docs Confidence: HIGH (0.90) - Production-validated pattern Primary Users: aws-expert, data-engineer-role, frontend-developer-role, data-architect-role

Problem Statement

Scenario: Deploy new AWS service or update existing infrastructure Challenge: AWS documentation changes frequently, training data becomes outdated Goal: Deploy infrastructure following CURRENT best practices and service limits

Integration Pattern Overview

Tool Coordination Strategy

code

aws-docs → Search for current best practices
    ↓
aws-docs → Read specific service documentation
    ↓
aws-docs → Get related content (new features, security)
    ↓
aws-api → Validate existing infrastructure state
    ↓
aws-expert → Design infrastructure based on current docs
    ↓
HUMAN → Implement changes (aws-api is read-only)

Critical Insight: aws-docs provides CURRENT documentation (post-training cutoff)

•Service limits may have increased since training
•New features released after Jan 2025
•Updated security best practices
•Changed API parameters

Why both tools needed:

•aws-docs: CURRENT best practices, limits, features (not just training data)
•aws-api: Existing infrastructure state, validation
•Together: Informed decisions based on latest AWS knowledge + current state

Step-by-Step Workflow

Step 1: Documentation Search (aws-docs)

Search for service best practices:

bash

# 1. Search for current best practices
mcp__aws-docs__search_documentation \
  search_phrase="ECS Fargate deployment best practices 2025" \
  limit=10

Returns:

•Top 10 documentation pages ranked by relevance
•URLs to official AWS documentation
•Brief context snippets for each result

Review results for:

•Official best practices guides
•Service launch announcements (new features)
•Security recommendations
•Performance optimization guides

Step 2: Read Best Practices (aws-docs)

Read specific documentation page:

bash

# 2. Read best practices guide
mcp__aws-docs__read_documentation \
  url="https://docs.aws.amazon.com/ecs/latest/bestpracticesguide/intro.html" \
  max_length=5000 \
  start_index=0

If document is long (truncated):

bash

# 3. Continue reading from where we left off
mcp__aws-docs__read_documentation \
  url="https://docs.aws.amazon.com/ecs/latest/bestpracticesguide/intro.html" \
  max_length=5000 \
  start_index=5000

Extract key information:

•Task definition best practices (CPU/memory sizing)
•Network configuration (VPC, security groups, ALB)
•IAM role requirements (task role vs execution role)
•Logging and monitoring recommendations
•Auto-scaling strategies
•Cost optimization tips

Step 3: Get Related Documentation (aws-docs)

Discover related content:

bash

# 4. Get recommendations for related documentation
mcp__aws-docs__recommend \
  url="https://docs.aws.amazon.com/ecs/latest/bestpracticesguide/intro.html"

Returns 4 types of recommendations:

•Highly Rated: Popular pages within ECS (frequently viewed)
•New: Recently added pages (NEW FEATURES after training cutoff)
•Similar: Related topics (security, networking, troubleshooting)
•Journey: Commonly viewed next (typical learning path)

Check "New" recommendations for:

•Features released after Jan 2025
•Updated service limits
•New integration patterns
•Recent security updates

Step 4: Service Limits Verification (aws-docs)

Search for current service limits:

bash

# 5. Search for service quotas and limits
mcp__aws-docs__search_documentation \
  search_phrase="ECS service quotas limits" \
  limit=5

Read service limits page:

bash

# 6. Read current service limits
mcp__aws-docs__read_documentation \
  url="https://docs.aws.amazon.com/ecs/latest/developerguide/service-quotas.html" \
  max_length=5000

Why critical:

•Service limits may have increased since training (e.g., tasks per service: 100 → 1000)
•New quotas added for new features
•Regional differences in limits
•Soft limits vs hard limits

Step 5: Validate Current Infrastructure (aws-api)

Check current AWS account/region:

bash

# 7. Verify AWS account and credentials
mcp__aws-api__call_aws \
  cli_command="aws sts get-caller-identity"

List existing ECS clusters:

bash

# 8. List ECS clusters in current region
mcp__aws-api__call_aws \
  cli_command="aws ecs list-clusters --region us-west-2"

Get cluster details:

bash

# 9. Describe specific cluster
mcp__aws-api__call_aws \
  cli_command="aws ecs describe-clusters --clusters my-cluster --region us-west-2"

List services in cluster:

bash

# 10. List services
mcp__aws-api__call_aws \
  cli_command="aws ecs list-services --cluster my-cluster --region us-west-2"

Get service details:

bash

# 11. Describe specific service
mcp__aws-api__call_aws \
  cli_command="aws ecs describe-services --cluster my-cluster --services my-service --region us-west-2"

Step 6: Design Deployment Strategy (aws-expert)

Delegate to specialist:

markdown

DELEGATE TO: aws-expert

CONTEXT:
- Task: Deploy new ECS Fargate service for data pipeline
- Current State:
  - Cluster: analytics-cluster (existing, us-west-2)
  - Services: 3 existing services running
  - VPC: Existing (private subnets, NAT gateway)
  - ALB: Existing (HTTPS listener with OIDC auth)

- Documentation Research (from aws-docs):
  - Best Practices: Task size 0.5 vCPU / 1 GB minimum for Python apps
  - Network: Use awsvpc network mode with private subnets
  - IAM: Separate task role (permissions) from execution role (ECR/CloudWatch)
  - Logging: CloudWatch Logs with log group per service
  - Auto-scaling: Target tracking on CPU (70% threshold)
  - New Feature (2025): ECS Exec enabled for debugging (post-training)

- Current Service Limits (verified with aws-docs):
  - Tasks per service: 5,000 (increased from 1,000 in training)
  - Services per cluster: 5,000 (no change)
  - Task definition size: 64 KB (increased from 32 KB)

- Requirements:
  - Service: data-ingestion-pipeline
  - Task: Python application (dlthub connector)
  - CPU: 0.5 vCPU
  - Memory: 1 GB
  - Schedule: Run every hour (EventBridge trigger)
  - Logs: CloudWatch Logs (7-day retention)
  - Network: Private subnet with NAT for internet access

- Constraints:
  - Must use existing VPC and ALB
  - Cost-optimized (use Fargate Spot if possible)
  - No public IP exposure
  - IAM least-privilege principle

REQUEST: "Infrastructure deployment recommendations based on CURRENT AWS docs and best practices"

Specialist provides:

•Task definition (JSON with current best practices)
•Service definition (launch type, network config, auto-scaling)
•IAM roles (task role, execution role with least privilege)
•CloudWatch configuration (log groups, metrics, alarms)
•Cost optimization (Fargate Spot, right-sizing)
•Security configuration (network isolation, secrets management)

Step 7: Validation Before Deployment (aws-api)

Verify prerequisites exist:

bash

# 12. Check VPC and subnets
mcp__aws-api__call_aws \
  cli_command="aws ec2 describe-subnets --filters Name=tag:Name,Values=*private* --region us-west-2"

# 13. Check security groups
mcp__aws-api__call_aws \
  cli_command="aws ec2 describe-security-groups --filters Name=vpc-id,Values=vpc-12345 --region us-west-2"

# 14. Check IAM roles exist
mcp__aws-api__call_aws \
  cli_command="aws iam get-role --role-name ecsTaskExecutionRole"

# 15. Check ECR repository
mcp__aws-api__call_aws \
  cli_command="aws ecr describe-repositories --repository-names data-ingestion-pipeline --region us-west-2"

Get latest Docker image tag:

bash

# 16. List ECR images
mcp__aws-api__call_aws \
  cli_command="aws ecr list-images --repository-name data-ingestion-pipeline --region us-west-2"

Step 8: Deployment (HUMAN Implementation)

Why human implements:

•aws-api MCP has READ_OPERATIONS_ONLY=true (safety restriction)
•Write operations (Create/Update/Delete) blocked in MCP
•
aws-expert provides recommendations, human implements via:
- •AWS Console (visual interface)
- •AWS CLI directly (outside MCP)
- •CloudFormation/Terraform (IaC)

Deployment checklist (provided by aws-expert):

bash

# HUMAN executes these commands (outside MCP):

# 1. Register task definition
aws ecs register-task-definition --cli-input-json file://task-definition.json --region us-west-2

# 2. Create service
aws ecs create-service --cli-input-json file://service-definition.json --region us-west-2

# 3. Create CloudWatch alarms
aws cloudwatch put-metric-alarm --cli-input-json file://cpu-alarm.json --region us-west-2

# 4. Create EventBridge rule (hourly trigger)
aws events put-rule --name hourly-ingestion --schedule-expression "rate(1 hour)" --region us-west-2

# 5. Add ECS task as target
aws events put-targets --rule hourly-ingestion --targets file://ecs-target.json --region us-west-2

Step 9: Post-Deployment Validation (aws-api)

Verify service created:

bash

# 17. Check service status
mcp__aws-api__call_aws \
  cli_command="aws ecs describe-services --cluster analytics-cluster --services data-ingestion-pipeline --region us-west-2"

Check task is running:

bash

# 18. List running tasks
mcp__aws-api__call_aws \
  cli_command="aws ecs list-tasks --cluster analytics-cluster --service-name data-ingestion-pipeline --region us-west-2"

# 19. Describe task details
mcp__aws-api__call_aws \
  cli_command="aws ecs describe-tasks --cluster analytics-cluster --tasks [task-arn] --region us-west-2"

Verify CloudWatch logs:

bash

# 20. Check log streams exist
mcp__aws-api__call_aws \
  cli_command="aws logs describe-log-streams --log-group-name /ecs/data-ingestion-pipeline --region us-west-2"

Real-World Example

Scenario: Deploy Customer Dashboard React App to ECS

Step 1: Documentation Research (aws-docs)

bash

# Search for React app deployment
mcp__aws-docs__search_documentation \
  search_phrase="ECS Fargate React application deployment NGINX" \
  limit=5

# Read best practices
mcp__aws-docs__read_documentation \
  url="https://docs.aws.amazon.com/ecs/latest/bestpracticesguide/application.html"

# Get new features
mcp__aws-docs__recommend \
  url="https://docs.aws.amazon.com/ecs/latest/developerguide/Welcome.html"

Key findings from docs:

•Use NGINX as reverse proxy (lightweight, production-ready)
•Task sizing: 0.25 vCPU / 0.5 GB for NGINX + static React
•Health checks: ALB health check on /health endpoint
•Logging: NGINX access logs to CloudWatch
•New (2025): ECS Service Connect for service mesh (post-training feature)

Step 2: Current Infrastructure (aws-api)

bash

# Check existing resources
mcp__aws-api__call_aws cli_command="aws ecs describe-clusters --clusters app-cluster --region us-west-2"
mcp__aws-api__call_aws cli_command="aws elbv2 describe-load-balancers --region us-west-2"
mcp__aws-api__call_aws cli_command="aws ecr describe-repositories --repository-names customer-dashboard --region us-west-2"

Findings:

•Cluster: app-cluster (exists)
•ALB: app-alb with OIDC auth (exists)
•ECR: customer-dashboard:latest (exists)
•VPC: Private subnets with NAT (exists)

Step 3: Deployment Design (aws-expert) Based on CURRENT docs + existing infrastructure:

Task Definition:

json

{
  "family": "customer-dashboard",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",  // 0.25 vCPU (docs recommend for NGINX + static)
  "memory": "512",  // 0.5 GB (docs recommend minimum)
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/salesJournalTaskRole",
  "containerDefinitions": [{
    "name": "customer-dashboard",
    "image": "account.dkr.ecr.us-west-2.amazonaws.com/customer-dashboard:latest",
    "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
    "healthCheck": {
      "command": ["CMD-SHELL", "curl -f http://localhost/health || exit 1"],
      "interval": 30,
      "timeout": 5,
      "retries": 3
    },
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/customer-dashboard",
        "awslogs-region": "us-west-2",
        "awslogs-stream-prefix": "ecs"
      }
    }
  }]
}

Service Definition (includes new 2025 features):

•Launch type: FARGATE
•Network: Private subnets
•Load balancer: app-alb (OIDC auth)
•Auto-scaling: Target tracking on CPU (70%)
•NEW: ECS Service Connect enabled (service mesh, post-training)

Step 4: Deployment (HUMAN)

bash

# Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json

# Create service with Service Connect (NEW 2025 feature)
aws ecs create-service \
  --cluster app-cluster \
  --service-name customer-dashboard \
  --task-definition customer-dashboard \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx]}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=customer-dashboard,containerPort=80" \
  --service-connect-configuration "enabled=true,namespace=app-services"

Step 5: Validation (aws-api)

bash

# Check service status
mcp__aws-api__call_aws \
  cli_command="aws ecs describe-services --cluster app-cluster --services customer-dashboard --region us-west-2"

# Verify tasks running
mcp__aws-api__call_aws \
  cli_command="aws ecs list-tasks --cluster app-cluster --service-name customer-dashboard --region us-west-2"

# Check target health (ALB)
mcp__aws-api__call_aws \
  cli_command="aws elbv2 describe-target-health --target-group-arn [tg-arn] --region us-west-2"

Result: Service deployed successfully using CURRENT 2025 best practices (ECS Service Connect, right-sized tasks, proper health checks)

Why aws-docs Currency is CRITICAL

Example 1: Service Limits Increase

Training Data (Jan 2025):

•ECS tasks per service: 1,000 max

Current AWS Docs (Oct 2025):

•ECS tasks per service: 5,000 max (5x increase!)

Impact: Without aws-docs, would design around outdated 1,000 limit

Example 2: New Features Released

Training Data (Jan 2025):

•No knowledge of ECS Service Connect (released Feb 2025)

Current AWS Docs (Oct 2025):

•ECS Service Connect documented in "New" recommendations

Impact: Without aws-docs, would miss modern service mesh feature

Example 3: Security Best Practices Update

Training Data (Jan 2025):

•IMDSv1 acceptable for ECS tasks

Current AWS Docs (Oct 2025):

•IMDSv2 required for all new ECS tasks (security hardening)

Impact: Without aws-docs, would deploy with outdated security config

Tool-Specific Responsibilities

aws-docs Responsibilities

•✅ CURRENT best practices (post-training updates)
•✅ Service limits (current quotas, not outdated limits)
•✅ New features (released after training)
•✅ Security recommendations (latest threat landscape)
•✅ API parameters (new parameters added)
•✅ Related content discovery (new guides, features)
•❌ Infrastructure state (delegate to aws-api)
•❌ Resource inventory (delegate to aws-api)

aws-api Responsibilities

•✅ Infrastructure inventory (existing resources)
•✅ Resource details (configurations, state)
•✅ Validation queries (verify prerequisites)
•✅ Post-deployment validation (confirm success)
•❌ Write operations (READ_ONLY mode for safety)
•❌ Documentation lookup (delegate to aws-docs)

Why Both Tools Required

•aws-docs alone: Current knowledge but no infrastructure state
•aws-api alone: Infrastructure state but outdated best practices
•Together: Informed decisions based on latest AWS knowledge + current state

Common Deployment Patterns

Pattern 1: Documentation-First Deployment

code

aws-docs (search) → aws-docs (read) → aws-docs (recommend) →
aws-api (validate) → aws-expert (design) → HUMAN (deploy) →
aws-api (verify)

When: New service deployment, unfamiliar AWS service Benefit: Follow current best practices, avoid outdated patterns

Pattern 2: New Feature Adoption

code

aws-docs (service welcome page) → aws-docs (recommend "New") →
aws-docs (read new feature docs) → aws-expert (evaluate) →
HUMAN (implement)

When: Discover and adopt features released after training Benefit: Leverage latest AWS capabilities

Pattern 3: Service Limit Verification

code

aws-docs (search quotas) → aws-docs (read limits) →
aws-expert (design within limits) → aws-api (validate)

When: Large-scale deployments, quota planning Benefit: Design within CURRENT limits (not outdated training data)

Best Practices

1. Always Start with aws-docs

•Search for best practices BEFORE designing infrastructure
•Read current documentation (not just training knowledge)
•Check "New" recommendations for post-training features
•Verify service limits (may have increased)

2. Validate with aws-api

•Check existing infrastructure before deployment
•Verify prerequisites exist (VPC, IAM roles, etc.)
•Get current resource state for informed decisions

3. Document Documentation Sources

•Record which AWS doc URLs informed design
•Note new features discovered (for team knowledge)
•Track service limit changes (for capacity planning)

4. Human Implements (aws-api Read-Only)

•aws-expert provides recommendations
•Human implements via Console/CLI/IaC
•aws-api validates post-deployment

5. Monitor for Documentation Updates

•Periodically check "New" recommendations
•Review service announcements for limit increases
•Update deployment patterns as AWS evolves

Success Metrics

Documentation Currency

•Best practices: 100% current (not outdated training data)
•Service limits: CURRENT limits (not Jan 2025 limits)
•New features: Discovered via "New" recommendations

Deployment Quality

•First-time success: >90% (informed by current docs)
•Security posture: Current best practices applied
•Cost optimization: Right-sized based on latest guidance

Knowledge Transfer

•Team learning: New features documented and shared
•Pattern library: Current patterns captured for reuse
•Incident reduction: Fewer misconfigurations from outdated knowledge

Related Patterns

•AWS Infrastructure Optimization: (Pattern to be created)
•ECS Deployment Runbook: knowledge/applications/customer-dashboard/deployment/
•aws-expert Agent: .claude/agents/specialists/aws-expert.md

Created: 2025-10-08 Pattern Type: Cross-Tool Integration Confidence: HIGH (0.90) - Documentation currency critical