terraform-infrastructure

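Terraform and infrastructure-as-code standards. Use when writing Terraform modules, configuring AWS resources, managing state files, or implementing IaC architecture patterns. Covers naming conventions, tag management, and the configuration and application of IAM policies.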

SKILL.md
--- frontmatter
name: terraform-infrastructure
description: |
  Terraform and infrastructure-as-code standards. Use when writing Terraform
  modules, configuring AWS resources, managing state files, or implementing
  IaC patterns. Covers naming conventions, tagging, and IAM policies.
disposition: always
version: 1.1.0

Terraform Infrastructure Rules

You are an expert Terraform developer working for the Enterprise DevOps team. When generating or modifying Terraform code, you MUST strictly adhere to these rules and best practices.

Note: This skill contains enterprise-specific Terraform standards, including custom tagging requirements, state file organization, and AWS-specific patterns. It complements the official HashiCorp terraform-style-guide skill, which provides generic style conventions. When both skills are active, the enterprise-specific rules in this skill take precedence.

Core Principles

  1. Follow Enterprise Style Guide: All code MUST conform to the company's established Terraform style guide
  2. Industry Best Practices: You MUST implement security, maintainability, and scalability best practices
  3. State Management: You MUST ensure proper state file isolation and dependency management
  4. Security First: You MUST implement least privilege access and secure resource configurations

Naming Conventions

Resource Names

  • You MUST use snake_case for all Terraform resource names, data sources, variables, and outputs
  • You MUST NOT repeat the resource type in the resource name
  • You SHOULD use the literal name "this" as the default resource name unless it creates ambiguity
  • For multiple instances, you MUST use descriptive names

Examples:

hcl
# Good
resource "aws_route_table" "public" {}
resource "aws_route_table" "this" {}
resource "aws_route_table" "private" {}

# Bad
resource "aws_route_table" "public_route_table" {}
resource "aws_route_table" "public_aws_route_table" {}

User-Facing Names

  • You MUST use dashes for user-facing names (AWS resource names, tags, etc.)
  • You MUST follow pattern: {environment}-{service}-{details}
  • You MUST use locals or variables for environment and service names

Examples:

hcl
locals {
  environment = "prod"
  service     = "api"
}

resource "aws_lambda_function" "this" {
  function_name = format("%s-%s", local.environment, local.service)
  # ...
}
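
When a resource needs the optional {details} segment, the same pattern extends to three parts. A minimal sketch, assuming a hypothetical details local and an SNS topic chosen purely for illustration:

```hcl
locals {
  environment = "prod"
  service     = "api"
  details     = "alerts" # hypothetical value for illustration
}

resource "aws_sns_topic" "alerts" {
  # Produces "prod-api-alerts", following {environment}-{service}-{details}
  name = format("%s-%s-%s", local.environment, local.service, local.details)
}
```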

File Structure and Organization

Module Structure

  • You MUST use standard files: main.tf, variables.tf, outputs.tf
  • You SHOULD create separate files for each AWS resource type
  • You MUST include README.md with module documentation
  • You SHOULD use terraform-docs for documentation generation

File Organization:

code
module/
├── main.tf
├── variables.tf
├── outputs.tf
├── iam.tf
├── security_groups.tf
├── s3.tf
├── cloudwatch.tf
├── README.md
└── versions.tf

State File Organization

  • Account Level: account_prep/, general/, shared_security_groups/
  • Service Level: Service-specific directories with max 4 state files:
    • code_deployment/ - Direct code deployment infrastructure
    • infra_deployment/ - Supporting infrastructure (no persistent data)
    • persistent_data/ - Databases, storage (requires significant review)
    • pipeline/ - Automated pipelines (consider how configuration changes affect deployments)

Variables, Outputs, and Locals

Variables

  • You MUST declare all variables in variables.tf
  • You SHOULD sort alphabetically
  • You MUST include description and type for all variables
  • You SHOULD provide default with sensible non-production values when appropriate
  • You MUST use plural names for lists and maps
  • You SHOULD name variables after the resource arguments they feed (e.g., instance_type)

Example:

hcl
variable "security_group_ids" {
  description = "IDs of the security groups for the service"
  type        = list(string)
  default     = []
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

Outputs

  • You MUST declare all outputs in outputs.tf
  • You SHOULD sort alphabetically
  • You MUST include descriptions
  • You MUST use plural names for lists and maps
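
The output rules can be sketched as follows. This assumes the aws_lambda_function.this resource and the security_group_ids variable from the earlier examples; the output names themselves are illustrative:

```hcl
# outputs.tf, sorted alphabetically

output "lambda_function_arn" {
  description = "ARN of the Lambda function"
  value       = aws_lambda_function.this.arn
}

# Plural name because the value is a list
output "security_group_ids" {
  description = "IDs of the security groups attached to the service"
  value       = var.security_group_ids
}
```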

Locals

  • You SHOULD use a single locals.tf file unless keeping locals next to the resources that use them improves readability
  • You SHOULD use locals for computed values and common configurations
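
A possible locals.tf along these lines, assuming environment and service variables as used elsewhere in this document. The local names and retention values are hypothetical:

```hcl
# locals.tf

locals {
  # Computed once and reused, instead of repeating the comparison inline
  is_production = var.environment == "prod"

  # Hypothetical retention periods, for illustration only
  log_retention_in_days = local.is_production ? 90 : 14

  # Common prefix for user-facing names
  name_prefix = format("%s-%s", var.environment, var.service)
}
```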

Version Management

Terraform Versions

  • You MUST pin Terraform versions using required_version
  • You SHOULD use tfenv with .terraform-version files
  • You SHOULD set .terraform-version to latest:^1\.[0-9]\+\.[0-9]\+$ so that tfenv selects the latest stable v1.x.y release

Example:

hcl
terraform {
  required_version = "~> 1.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }
}

Provider Versions

  • You MUST pin to major versions (e.g., aws ~> 6.0)
  • You MUST use specific tags for external modules
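
For external modules sourced from Git, pinning to a tag usually means adding a ref query parameter to the source. A sketch, in which the repository URL, the tag, and the module inputs are all hypothetical:

```hcl
module "vpc" {
  # Hypothetical repository and tag; ref pins the module to a released version
  source = "git::https://example.com/enterprise/terraform-aws-vpc.git?ref=v1.2.3"

  # Hypothetical module inputs
  environment = var.environment
  service     = var.service
}
```

Modules pulled from a registry are pinned with the version argument instead of a ref.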

Security and IAM

IAM Policies

  • You MUST use aws_iam_policy_document data sources for policy definitions
  • You MUST follow least privilege principle
  • You MUST include descriptive SIDs for each statement
  • You SHOULD separate policy documents from policy resources

Example:

hcl
data "aws_iam_policy_document" "lambda_execution" {
  statement {
    sid = "LambdaExecutionRole"
    actions = [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ]
    effect    = "Allow"
    resources = ["*"] # Narrow to specific log group ARNs where possible (least privilege)
  }
}

resource "aws_iam_policy" "lambda_execution" {
  name   = format("%s-lambda-execution", var.environment)
  policy = data.aws_iam_policy_document.lambda_execution.json
}

Security Groups

  • You MUST use separate aws_security_group_rule resources instead of inline rules
  • You SHOULD reference existing security groups via data sources when possible
  • You MUST follow naming conventions for security group names
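
These rules might combine as in the sketch below. It assumes a vpc_id variable, the common_tags local defined in the Tagging Strategy section, and a hypothetical existing security group named {environment}-shared-alb:

```hcl
# Look up an existing group instead of hardcoding its ID
data "aws_security_group" "shared_alb" {
  name = format("%s-shared-alb", var.environment) # hypothetical existing group
}

resource "aws_security_group" "this" {
  name        = format("%s-%s", var.environment, var.service)
  description = "Security group for the service"
  vpc_id      = var.vpc_id

  tags = local.common_tags
}

# Rule declared as its own resource rather than an inline ingress block
resource "aws_security_group_rule" "ingress_https_from_alb" {
  type                     = "ingress"
  description              = "HTTPS from the shared ALB"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.this.id
  source_security_group_id = data.aws_security_group.shared_alb.id
}
```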

Tagging Strategy

Required Tags

Implement comprehensive tagging following the Enterprise cost allocation guidelines:

hcl
locals {
  common_tags = {
    Name         = format("%s-%s", var.environment, var.service)
    environment  = var.environment
    iac_source   = var.iac_source
    product      = var.product
    product_line = var.product_line
    service      = var.service
    terraform    = true
  }
}

All resources that support tagging MUST be tagged.
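
One way to apply the common tags while overriding Name for an individual resource is merge(). A minimal sketch, assuming the common_tags local above; the queue itself is illustrative:

```hcl
resource "aws_sqs_queue" "events" {
  name = format("%s-%s-events", var.environment, var.service)

  # Later arguments to merge() win, so Name is overridden
  # while every other required tag is kept.
  tags = merge(local.common_tags, {
    Name = format("%s-%s-events", var.environment, var.service)
  })
}
```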

Resource Configuration Best Practices

General Rules

  • You MUST use module-relative paths (path.module) with the file() helper
  • You SHOULD prefer separate resources over inline blocks
  • You MUST define AWS region as a variable in modules
  • You SHOULD use data sources for existing resources when appropriate
  • You MUST implement proper dependency management
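
The region and file path rules could look like the following. This is a sketch only: the aws_region variable name, the local, and the dashboards/service.json file are assumptions made for illustration:

```hcl
variable "aws_region" {
  description = "AWS region in which resources are created"
  type        = string
  default     = "us-east-1"
}

data "aws_caller_identity" "current" {}

locals {
  # Build ARNs from the region variable rather than a hardcoded region
  log_group_arn_prefix = format(
    "arn:aws:logs:%s:%s:log-group:",
    var.aws_region,
    data.aws_caller_identity.current.account_id
  )
}

resource "aws_cloudwatch_dashboard" "this" {
  dashboard_name = format("%s-%s", var.environment, var.service)

  # path.module resolves relative to this module, wherever it is called from
  dashboard_body = file("${path.module}/dashboards/service.json")
}
```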

State Management

  • You MUST use S3 remote state with versioning, encryption, and DynamoDB locking:

    hcl
    terraform {
      backend "s3" {
        bucket         = "enterprise-terraform-state"
        key            = "env/mgmt/security/service.tfstate"
        region         = "us-east-1"
        encrypt        = true
        kms_key_id     = "arn:aws:kms:us-east-1:123456789012:key/e8ae78af-6a84-47cd-a5e0-bf1e4901ee59"
        dynamodb_table = "enterprise-terraform-locks"
        assume_role = {
          role_arn = "arn:aws:iam::123456789012:role/terraform-automation"
        }
      }
    }
    
  • You MUST implement proper state file isolation

  • You SHOULD avoid cross-dependencies between state files unless explicitly declared

  • You MUST use depends_on to declare dependencies that Terraform cannot infer from resource references
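
A common case for an explicit dependency is a Lambda function that must not be created before its role has the execution policy attached. A sketch, assuming an aws_iam_role.lambda resource defined elsewhere and the aws_iam_policy.lambda_execution resource from the IAM example:

```hcl
resource "aws_iam_role_policy_attachment" "lambda_execution" {
  role       = aws_iam_role.lambda.name
  policy_arn = aws_iam_policy.lambda_execution.arn
}

resource "aws_lambda_function" "this" {
  function_name = format("%s-%s", var.environment, var.service)
  role          = aws_iam_role.lambda.arn
  # ...

  # The function references the role but not the attachment,
  # so this ordering has to be stated explicitly.
  depends_on = [aws_iam_role_policy_attachment.lambda_execution]
}
```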

Error Handling

  • You MUST implement proper validation in variables
  • You SHOULD use precondition and postcondition blocks for resource validation
  • You MUST handle empty/default values gracefully
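
The three rules above can be sketched together. The list of allowed environments and the subnet_ids variable are assumptions for illustration, and security_group_ids is taken to default to an empty list as in the Variables example:

```hcl
variable "environment" {
  description = "Deployment environment"
  type        = string

  validation {
    # Hypothetical list of allowed environments
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}

variable "subnet_ids" {
  description = "IDs of the subnets for the function; empty means no VPC attachment"
  type        = list(string)
  default     = []
}

resource "aws_lambda_function" "this" {
  function_name = format("%s-%s", var.environment, var.service)
  # ...

  # Emit vpc_config only when subnets are supplied, so the empty default is valid
  dynamic "vpc_config" {
    for_each = length(var.subnet_ids) > 0 ? [1] : []
    content {
      security_group_ids = var.security_group_ids
      subnet_ids         = var.subnet_ids
    }
  }

  lifecycle {
    precondition {
      condition     = length(var.subnet_ids) == 0 || length(var.security_group_ids) > 0
      error_message = "security_group_ids must not be empty when subnet_ids is set."
    }
  }
}
```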

Code Quality and Formatting

Sorting and Organization

  • You SHOULD sort variables, outputs, and locals alphabetically
  • You SHOULD sort resource arguments alphabetically when possible
  • You MUST place count as the first line in resource blocks
  • You MUST place tags as the last argument (before depends_on and lifecycle)
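
An illustration of the argument ordering, assuming a hypothetical create_log_group variable and the common_tags local from the Tagging Strategy section:

```hcl
resource "aws_cloudwatch_log_group" "this" {
  # count comes first
  count = var.create_log_group ? 1 : 0

  # Remaining arguments, sorted alphabetically
  name              = format("/aws/lambda/%s-%s", var.environment, var.service)
  retention_in_days = 14

  # tags is the last argument...
  tags = local.common_tags

  # ...followed only by depends_on and lifecycle, when present
  lifecycle {
    prevent_destroy = true
  }
}
```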

Comments and Documentation

  • You MUST include meaningful descriptions for all variables and outputs
  • You SHOULD document complex logic with comments
  • You MUST use consistent comment formatting

Module Development

Module Structure

  • You MUST follow the cookiecutter-terraform-module structure
  • You MUST include comprehensive README.md with usage examples
  • You MUST implement proper variable validation
  • You MUST use semantic versioning for releases

Testing

  • You SHOULD include test examples in test/ directory
  • You SHOULD test with different variable combinations
  • You MUST validate module outputs
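
Output validation could be expressed with Terraform's native test framework. The sketch below assumes Terraform 1.7 or later (for mock_provider), a module with environment and service variables, and a hypothetical function_name output. Terraform looks for test files in tests/ by default, so a test/ directory would need the -test-directory flag.

```hcl
# test/naming.tftest.hcl

# Mock the provider so the plan does not need AWS credentials
mock_provider "aws" {}

variables {
  environment = "dev"
  service     = "api"
}

run "function_name_follows_convention" {
  command = plan

  assert {
    # function_name is a hypothetical module output
    condition     = output.function_name == "dev-api"
    error_message = "function_name must follow the {environment}-{service} pattern."
  }
}
```

Further run blocks with their own variables blocks can cover other variable combinations.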

Secrets Management

KMS Integration

  • You MUST use AWS KMS for secret encryption
  • You SHOULD create service-specific KMS keys when appropriate
  • You SHOULD use encryption context for additional security
  • You MUST pass secrets in encrypted form and decrypt them through data sources (e.g., aws_kms_secrets)

Example:

hcl
data "aws_kms_secrets" "database" {
  secret {
    name    = "master_password"
    payload = var.encrypted_password
    context = {
      resource = "database"
      key      = "password"
    }
  }
}
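
The decrypted value is exposed through the data source's plaintext map, keyed by the secret name. A sketch of consuming it, where the database resource is illustrative and its other required arguments are elided:

```hcl
resource "aws_db_instance" "this" {
  identifier = format("%s-%s", var.environment, var.service)

  # Keyed by the name given in the secret block above
  password = data.aws_kms_secrets.database.plaintext["master_password"]
  # ...
}
```

The decrypted value is still written to the state file, which is one more reason to keep remote state encrypted and access-controlled.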

Deployment Considerations

Automation Compatibility

  • You MUST ensure all resources can be created without targeting
  • You SHOULD avoid dependencies on external state files
  • You MUST use proper variable defaults for different environments
  • You MUST implement proper error handling for missing dependencies

Environment Management

  • You MUST use consistent naming across environments
  • You MUST implement environment-specific configurations
  • You SHOULD use workspaces or separate state files for different environments
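
One possible way to keep environment-specific configuration in a single place is a map keyed by environment. The local names, environment keys, and values below are hypothetical:

```hcl
locals {
  # Hypothetical per-environment settings
  environment_settings = {
    dev = {
      instance_type = "t3.micro"
      min_size      = 1
    }
    prod = {
      instance_type = "m5.large"
      min_size      = 3
    }
  }

  # Fails at plan time if var.environment has no entry in the map
  settings = local.environment_settings[var.environment]
}
```

Resources then read local.settings.instance_type and similar attributes, while names stay consistent through the shared {environment}-{service} pattern.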

Performance and Scalability

Resource Optimization

  • You SHOULD use data sources efficiently
  • You MUST implement proper resource lifecycle management
  • You SHOULD use for_each instead of count when appropriate
  • You SHOULD optimize for parallel resource creation
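
for_each keys each instance by a stable name, so removing one element does not shift the addresses of the others the way a count index can. A minimal sketch, assuming a hypothetical queue_names variable and the common_tags local from the Tagging Strategy section:

```hcl
variable "queue_names" {
  description = "Names of the queues to create for the service"
  type        = set(string)
  default     = []
}

resource "aws_sqs_queue" "this" {
  for_each = var.queue_names

  # Instances are addressed as aws_sqs_queue.this["<name>"]
  name = format("%s-%s-%s", var.environment, var.service, each.key)

  tags = local.common_tags
}
```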

State File Optimization

  • You MUST keep state files focused and manageable
  • You MUST avoid overly large state files
  • You MUST use proper state file isolation

Compliance and Governance

Audit Trail

  • You MUST implement comprehensive logging
  • You MUST use consistent naming for audit purposes
  • You SHOULD include proper resource descriptions
  • You MUST maintain clear dependency documentation

Cost Management

  • You MUST implement proper resource sizing
  • You MUST use appropriate instance types and storage classes
  • You MUST implement cost allocation tags
  • You SHOULD monitor and optimize resource usage

Error Prevention

Common Pitfalls to Avoid

  • You MUST NOT use hardcoded values for environment-specific configurations
  • You MUST NOT create circular dependencies between resources
  • You MUST NOT use terraform destroy without proper planning
  • You MUST validate changes with terraform plan before applying
  • You MUST use proper variable validation
  • You MUST implement proper error handling

Validation Rules

  • You MUST validate all variable inputs
  • You MUST use proper type constraints
  • You SHOULD implement business logic validation
  • You SHOULD test edge cases and error conditions

When Generating Code

  1. You MUST follow the Enterprise naming conventions
  2. You MUST include proper variable definitions with descriptions
  3. You MUST implement comprehensive tagging
  4. You MUST use proper resource dependencies
  5. You MUST include error handling and validation
  6. You MUST follow security best practices
  7. You MUST ensure automation compatibility
  8. You MUST include proper documentation
  9. You MUST use version pinning
  10. You MUST implement proper state management

Remember: The goal is to create maintainable, secure, and scalable infrastructure code that follows established patterns and industry best practices. All rules in this document use RFC 2119 terminology where MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY have specific meanings that define the requirement level.