Terraform Troubleshooting Skill
Table of Contents
Quick Start → What Is This | When to Use | Simple Example
How to Implement → Step-by-Step | Examples
Help → Requirements | See Also
Purpose
Troubleshoot Terraform errors efficiently using systematic debugging workflows, detailed error analysis, and proven solutions for common problems.
When to Use
Use this skill when you encounter:
- Terraform failures - Apply or plan commands fail with errors
- State lock issues - Lock timeout or concurrent modification errors
- Provider errors - GCP API errors, authentication failures
- Syntax problems - Invalid HCL, type mismatches, missing arguments
- Unexpected infrastructure changes - State drift, unintended modifications
- Version conflicts - Provider or Terraform version incompatibilities
- Permission errors - GCP IAM or service account issues
Error Categories:
- Language Errors - Syntax, configuration, type mismatches
- State Errors - Lock timeouts, corruption, concurrent access
- Core Errors - Terraform version, plugin issues
- Provider Errors - GCP API, permissions, authentication
Trigger Phrases:
- "Terraform apply failed"
- "Fix state lock error"
- "Debug Terraform syntax error"
- "Resolve GCP permission denied"
- "Fix state drift"
- "Terraform version incompatibility"
Quick Start
Diagnose a Terraform error in 5 minutes:
# 1. Enable debug logging
export TF_LOG=DEBUG
export TF_LOG_PATH=/tmp/terraform.log
# 2. Validate syntax
terraform validate
# 3. Run plan with detailed output
terraform plan -out=tfplan
# 4. Review logs for errors
grep -i error /tmp/terraform.log
# 5. Disable logging when done
unset TF_LOG
unset TF_LOG_PATH
Instructions
Step 1: Categorize the Error
Terraform errors fall into four categories. Identify which type you're dealing with:
1. Language Errors (Syntax, configuration)
- Invalid HCL syntax
- Type mismatches
- Missing required arguments
- Example:
Error: Invalid value for module argument
2. State Errors (State lock, corruption)
- Lock timeouts
- Concurrent modifications
- State file corruption
- Example:
Error: Error acquiring the state lock
3. Core Errors (Terraform version, plugins)
- Version incompatibility
- Missing plugins
- Initialization issues
- Example:
Error: Unsupported Terraform version
4. Provider Errors (GCP API, permissions)
- GCP API errors
- Authentication issues
- Permission denied
- Example:
Error: Error creating PubSub topic: googleapi: Error 403
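The four categories above can be told apart by matching on the error text. The sketch below is only an illustration: categorize_tf_error is a hypothetical helper, and its patterns cover just the example messages quoted in this guide, so real error output will need more patterns.

```shell
# Hypothetical helper: map a Terraform error line to one of the four categories.
# The patterns are assumptions based on the example messages in this guide.
categorize_tf_error() {
  case "$1" in
    *"state lock"*)
      echo "state" ;;
    *"Unsupported Terraform version"*|*"Incompatible provider version"*)
      echo "core" ;;
    *googleapi*|*"permission"*)
      echo "provider" ;;
    *"Invalid value"*|*"Missing required argument"*)
      echo "language" ;;
    *)
      echo "unknown" ;;
  esac
}
```

Calling categorize_tf_error with the first line of the error gives a starting point for choosing the matching subsection under Step 4.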
Step 2: Enable Detailed Logging
# Set debug logging
export TF_LOG=DEBUG
export TF_LOG_PATH=/tmp/terraform.log
# Available levels: TRACE, DEBUG, INFO, WARN, ERROR
# TRACE: Most verbose, includes all operations
# DEBUG: Detailed, good for troubleshooting
# INFO: General information
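Exported logging variables stay active until they are unset, which is easy to forget. One possible alternative, sketched here with a hypothetical with_tf_debug wrapper, is to enable logging for a single command only. The default log path is the one used in this guide.

```shell
# Hypothetical wrapper: run one command with debug logging enabled.
# The parentheses run the body in a subshell, so TF_LOG and TF_LOG_PATH
# never leak into the calling shell.
with_tf_debug() (
  export TF_LOG=DEBUG
  export TF_LOG_PATH="${TF_LOG_PATH:-/tmp/terraform.log}"
  "$@"
)

# Usage (assumes terraform is installed):
#   with_tf_debug terraform plan -out=tfplan
```

The wrapper returns the exit status of the wrapped command, so it can be used in scripts that check for failure.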
Reading Logs:
# Filter for errors
grep -i error /tmp/terraform.log
# Filter for a specific resource
grep "google_pubsub" /tmp/terraform.log
# Filter by date
grep "2025-11-14" /tmp/terraform.log
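Before filtering, a per-level summary can show whether the log contains any errors at all. This is a minimal sketch: count_log_levels is a hypothetical helper, and it assumes each log line tags its level in square brackets (for example "[DEBUG]"), which should be checked against the actual log file.

```shell
# Hypothetical helper: count log lines per level in a Terraform log file.
# Assumes each line tags its level in square brackets, e.g. "[DEBUG]".
count_log_levels() (
  for level in TRACE DEBUG INFO WARN ERROR; do
    # grep -c prints the number of matching lines; -F matches the
    # bracketed tag literally. "|| true" ignores grep's non-zero exit
    # status when there are no matches.
    count=$(grep -cF "[$level]" "$1" || true)
    echo "$level $count"
  done
)

# Usage:
#   count_log_levels /tmp/terraform.log
```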
Step 3: Execute Troubleshooting Workflow
Follow this sequence for systematic debugging:
# 1. Validate HCL syntax (requires an initialized working directory)
terraform validate
# ✓ Catches syntax, type, and required argument errors
# ✗ Does NOT validate against actual cloud state
# 2. Check formatting, then fix it
terraform fmt -check -recursive
terraform fmt -recursive
# 3. Refresh state (sync with actual infrastructure)
terraform refresh
# ✓ Updates Terraform state to match real infrastructure
# ✗ Does NOT change infrastructure, but it does rewrite the state file
# Note: terraform refresh is deprecated; terraform apply -refresh-only
# shows the detected changes and asks for approval first
# 4. Re-initialize (if provider issues)
terraform init -upgrade
# ✓ Upgrades providers to the newest versions the constraints allow
# ✗ Requires time for downloads
# 5. Plan with detailed output
terraform plan -out=tfplan
# ✓ Shows exactly what will change
# ✗ Does NOT make changes
# 6. Check logs
grep -i error /tmp/terraform.log
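The sequence above is most useful when it stops at the first failing step. A minimal sketch of that idea follows; run_step is a hypothetical helper, not part of Terraform.

```shell
# Hypothetical helper: run one labeled step and report whether it passed.
# Returns the exit status of the wrapped command.
run_step() {
  label=$1
  shift
  echo "==> $label"
  if "$@"; then
    echo "ok: $label"
  else
    status=$?
    echo "FAILED: $label (exit $status)" >&2
    return "$status"
  fi
}

# Usage (assumes terraform is installed); && stops at the first failure:
#   run_step "validate" terraform validate &&
#     run_step "format check" terraform fmt -check -recursive &&
#     run_step "plan" terraform plan -out=tfplan
```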
Step 4: Handle Specific Error Types
State Lock Errors
Problem: Another Terraform operation is running or left a stale lock.
# Option 1: Wait for lock (if operation is legitimately running)
terraform apply -lock-timeout=10m
# Option 2: Force unlock (use with caution!)
terraform force-unlock LOCK_ID  # Get LOCK_ID from error message
# Option 3: Manual recovery (last resort)
# Delete lock file from GCS backend
gsutil rm gs://bucket/prefix/default.tflock
Prevention:
- Use CI/CD with job queuing (prevents concurrent runs)
- Communicate with team before applying
- Use Terraform Cloud/Enterprise for automatic locking
Cycle Errors (Circular Dependencies)
Problem: Resources depend on each other in a circle.
Error: Cycle: resource_a, resource_b, resource_a
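Solution: Break the cycle by removing one of the two references, for example by replacing it with a literal value that both resources share. Adding depends_on does not help, because it only adds dependency edges: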
# ❌ BAD: Circular reference
resource "google_compute_firewall" "allow_app" {
  source_tags = [google_compute_instance.app.tags[0]]
}
resource "google_compute_instance" "app" {
  tags = [google_compute_firewall.allow_app.name]
}
# ✅ GOOD: Break dependency
resource "google_compute_firewall" "allow_app" {
  source_tags = ["app"] # Use explicit string instead
}
resource "google_compute_instance" "app" {
  tags = ["app"] # Explicit value
}
Provider Version Conflicts
Problem: Provider version constraint conflict.
Error: Incompatible provider version
Terraform requires >= 5.26.0, < 5.27.0
You have 6.0.0 installed
Solution:
# 1. Check current version
terraform version
# 2. Lock to compatible version
# In main.tf
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.26.0" # Allows 5.26.x, not 5.27.0
    }
  }
}
# 3. Re-initialize
terraform init -upgrade
# 4. Commit .terraform.lock.hcl
git add .terraform.lock.hcl
git commit -m "lock: pin Google provider to 5.26.0"
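The "~> 5.26.0" constraint above allows only the patch number to increase. The sketch below spells out that rule for three-part versions; satisfies_pessimistic is a hypothetical helper written for illustration, and it assumes plain numeric X.Y.Z versions with no pre-release suffix. Terraform's own constraint parser handles more cases.

```shell
# Hypothetical helper: does VERSION satisfy "~> BASE" where BASE is X.Y.Z?
# For a three-part BASE, "~>" requires the same major and minor numbers
# and a patch number that is at least the base patch.
satisfies_pessimistic() (
  version=$1
  base=$2
  # Split each version into major, minor, and patch parts.
  v_major=${version%%.*}; v_rest=${version#*.}
  v_minor=${v_rest%%.*};  v_patch=${v_rest#*.}
  b_major=${base%%.*};    b_rest=${base#*.}
  b_minor=${b_rest%%.*};  b_patch=${b_rest#*.}
  [ "$v_major" -eq "$b_major" ] &&
    [ "$v_minor" -eq "$b_minor" ] &&
    [ "$v_patch" -ge "$b_patch" ]
)

# Usage:
#   satisfies_pessimistic 6.0.0 5.26.0 || echo "6.0.0 is outside ~> 5.26.0"
```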
GCP Permission Errors
Problem: Service account lacks required GCP permissions.
Error: Error creating PubSub topic: googleapi: Error 403: The caller does not have permission
Solution:
# 1. Check current authentication
gcloud auth list
gcloud config get-value project
# 2. Verify service account permissions
gcloud projects get-iam-policy ecp-wtr-supplier-charges-prod \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:app-runtime@*"
# 3. Grant required role
gcloud projects add-iam-policy-binding ecp-wtr-supplier-charges-prod \
  --member="serviceAccount:app-runtime@project.iam.gserviceaccount.com" \
  --role="roles/pubsub.editor"
# 4. Re-plan
terraform plan
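Before granting a role, it can help to check whether the service account already holds it. The sketch below assumes the roles are available one per line on standard input; has_role is a hypothetical helper, and PROJECT_ID and SA_EMAIL in the usage comment are placeholders.

```shell
# Hypothetical helper: succeed if ROLE appears in a list of roles on stdin
# (one role per line).
has_role() {
  # -x matches whole lines only, -F treats the role as a literal string,
  # -q suppresses output so only the exit status is used.
  grep -qxF "$1"
}

# Usage (assumes gcloud is installed and authenticated):
#   gcloud projects get-iam-policy PROJECT_ID \
#     --flatten="bindings[].members" \
#     --filter="bindings.members:serviceAccount:SA_EMAIL" \
#     --format="value(bindings.role)" | has_role roles/pubsub.editor
```

A broader role such as roles/owner may also grant the needed permission, so a negative result from this check does not by itself explain a 403.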
State Out of Sync
Problem: Terraform state doesn't match actual infrastructure.
# Detect drift
terraform plan  # Proposes changes that you did not make in your .tf files
# Sync state
terraform refresh  # Updates state to match real infrastructure
# Manual fix (if refresh fails, e.g. the resource is missing from state)
terraform import google_pubsub_topic.incoming \
  projects/ecp-wtr-supplier-charges-prod/topics/my-topic
# Remove from state (if resource manually deleted)
terraform state rm google_pubsub_topic.incoming
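Drift detection can be scripted with the -detailed-exitcode flag of terraform plan, which reports the outcome through the exit status: 0 for no changes, 1 for an error, 2 for a successful plan with changes. The helper name describe_plan_status below is hypothetical; only the flag and its exit statuses come from Terraform.

```shell
# Hypothetical helper: describe the exit status of
# "terraform plan -detailed-exitcode".
describe_plan_status() {
  case "$1" in
    0) echo "in sync: no changes" ;;
    1) echo "error: plan failed" ;;
    2) echo "drift or pending changes: review the plan" ;;
    *) echo "unexpected exit status: $1" ;;
  esac
}

# Usage (assumes terraform is installed):
#   status=0
#   terraform plan -detailed-exitcode -out=tfplan || status=$?
#   describe_plan_status "$status"
```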
Step 5: Review and Recover
# Inspect the current state
terraform state list
terraform state show google_pubsub_topic.incoming
# Inspect the saved plan
terraform show tfplan | head -50
# Roll back by re-applying the previous configuration
git checkout HEAD~1  # Go back one commit (leaves a detached HEAD)
terraform plan
terraform apply
Examples
Example 1: Debugging State Lock
# Error occurs
# Error: Error acquiring the state lock
# Lock Info:
#   ID: abc123def456
#   Path: gs://terraform-state-prod/supplier-charges-hub/default.tflock
#   Created: 2025-11-14 10:30:00 UTC
# Step 1: Check if operation is running
gcloud compute operations list --filter="status:RUNNING"
# Step 2: If no running operation, force unlock
terraform force-unlock abc123def456
# Step 3: If force-unlock fails, delete lock file
gsutil rm gs://terraform-state-prod/supplier-charges-hub/default.tflock
# Step 4: Re-plan to verify state is correct
terraform refresh
terraform plan
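The lock ID needed for force-unlock can be pulled out of the error output instead of being copied by hand. This is a minimal sketch: extract_lock_id is a hypothetical helper, and it assumes the "Lock Info:" block contains a line whose first field is "ID:", as in the example above.

```shell
# Hypothetical helper: print the lock ID found in Terraform error output
# read from stdin. Prints nothing if no "ID:" line is present.
extract_lock_id() {
  # Print the second field of the first line whose first field is "ID:".
  awk '$1 == "ID:" { print $2; exit }'
}

# Usage (assumes terraform is installed):
#   terraform plan 2>&1 | extract_lock_id
```

Confirm that no other operation holds the lock before passing the extracted ID to terraform force-unlock.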
Example 2: Fixing Syntax Error
# Error occurs
# Error: Invalid value for module argument
# Step 1: Validate syntax
terraform validate
# Output shows exactly what's wrong:
# Error: Missing required argument
# on pubsub.tf line 5, in resource "google_pubsub_topic" "topics":
# 5: resource "google_pubsub_topic" "topics" {
# The argument "name" is required, but was not set.
# Step 2: Review and fix the file
# Add missing argument:
resource "google_pubsub_topic" "topics" {
  name = "my-topic" # Add this
}
# Step 3: Validate again
terraform validate
Example 3: GCP Permission Recovery
# Error occurs
# Error creating PubSub topic: googleapi: Error 403
# Step 1: Check authentication
gcloud auth list
gcloud config get-value project
# Step 2: Get current IAM bindings
gcloud projects get-iam-policy ecp-wtr-supplier-charges-prod
# Step 3: Add Pub/Sub Editor role
gcloud projects add-iam-policy-binding ecp-wtr-supplier-charges-prod \
  --member="serviceAccount:terraform@project.iam.gserviceaccount.com" \
  --role="roles/pubsub.editor"
# Step 4: Re-run Terraform
terraform plan
terraform apply
Requirements
- Terraform 1.x+ installed
- GCP credentials configured (gcloud auth or service account key)
- Writable location for the log file (for TF_LOG_PATH)
- GCP CLI tools installed (gcloud, gsutil)
See Also
- terraform-basics - General Terraform reference
- terraform-state-management - Advanced state patterns
- terraform-gcp-integration - GCP-specific issues