Drift Detection and Resolution Skill
Overview
This skill helps investigate and resolve infrastructure drift in HCP Terraform workspaces. Drift occurs when actual infrastructure state diverges from the Terraform configuration.
Prerequisites
- •HCP Terraform Plus or Enterprise (health assessments feature)
- •Health assessments enabled in workspace settings
- •Authenticated with
hcptfCLI (TFE_TOKENor~/.terraform.d/credentials.terraform.io)
Core Concepts
Drift Types:
- •delete: Resource was deleted outside Terraform - usually needs
terraform applyto recreate - •update: Resource was modified outside Terraform - may need code update or apply
- •create: Resource was created outside Terraform - may need import or removal
Resolution Strategies:
- •Apply fixes infrastructure: Run
terraform applyto make infrastructure match code - •Update code: Modify Terraform configuration to match actual infrastructure state
- •Update variables: Change variable values if drift is due to outdated variable values
- •Import resources: Import manually created resources into Terraform state
Workflow
1. Find Drifted Workspaces
Use Explorer API to find workspaces with drift:
# Find all drifted workspaces in organization hcptf explorer query -org=<org-name> -type=workspaces \ -fields=workspace-name,drifted,resources-drifted,resources-undrifted \ | grep -v "false" # Sort by most drifted resources hcptf explorer query -org=<org-name> -type=workspaces \ -sort=-resources-drifted -page-size=20
2. View Drift Details
Get detailed drift information for a workspace:
# View drift and check results (URL-style) hcptf <org> <workspace> assessments # Or flag-based hcptf assessmentresult list -org=<org-name> -name=<workspace-name>
This shows:
- •Which resources drifted
- •What actions are needed (delete, update, create)
- •Which attributes changed (before/after values)
- •Terraform check failures (continuous validation)
3. Understand the Configuration
Get workspace VCS information:
# View workspace details including VCS repo hcptf explorer query -org=<org-name> -type=workspaces \ -fields=workspace-name,vcs-repo-identifier \ -filter="workspace-name:<workspace>"
Get commit information from the run:
# Get current run ID from workspace hcptf <org> <workspace> # Shows CurrentRunID # View configuration version with VCS commit info hcptf <org> <workspace> runs <run-id> configversion
Or with JSON parsing:
RUN_ID=$(hcptf <org> <workspace> -output=json | jq -r '.CurrentRunID') hcptf <org> <workspace> runs $RUN_ID configversion
For VCS-backed workspaces, this provides:
- •Branch name
- •Commit SHA
- •Commit URL (direct link to GitHub/GitLab commit)
- •Repo identifier (e.g.,
username/repo)
4. Analyze and Decide
Review the drift details to determine the appropriate action:
Decision Matrix:
| Action Type | Common Cause | Typical Resolution |
|---|---|---|
delete | Manual deletion, autoscaling | Apply to recreate |
update with infrastructure values | Manual changes, console edits | Apply to revert OR update code to match |
update with computed values | Normal operation (IPs, timestamps) | Update code/variables to match |
| Tags added | Tagging automation | Update code to include tags |
| Check failures | Cert expiration, thresholds | Varies by check type |
Questions to ask:
- •Is the current infrastructure state correct?
- •Should the code be updated to match reality?
- •Is this drift expected (computed values, auto-scaling)?
- •Do variables need updating?
5. Take Action
Option A: Fix infrastructure (apply)
# Create a run to fix drift hcptf run create -org=<org> -name=<workspace> \ -message="Fix drift: [describe what's being fixed]" # Monitor the run hcptf run show -id=<run-id>
Option B: Update code
- •Clone the repository (use VCS info from step 3)
- •Checkout the branch that's connected to the workspace
- •Update Terraform files to match actual infrastructure
- •Commit and push - this triggers a new run automatically
Option C: Update variables
# List current variables hcptf variable list -org=<org> -workspace=<workspace> # Update a variable hcptf variable update -org=<org> -workspace=<workspace> \ -key=<variable-name> -value=<new-value>
Option D: Import resources (for manually created resources)
# This requires terraform CLI access to the workspace terraform import <resource_address> <resource_id>
6. Verify Resolution
After taking action, verify drift is resolved:
# Check if drift is cleared (may take a few minutes for next health check) hcptf explorer query -org=<org> -type=workspaces \ -filter="workspace-name:<workspace>" \ -fields=workspace-name,drifted,resources-drifted # View latest assessment result hcptf assessmentresult list -org=<org> -name=<workspace>
Common Scenarios
Scenario 1: EC2 Instance Deleted (Autoscaling)
# 1. Identify drift hcptf my-org my-workspace assessments # Shows: aws_instance.app - Action: delete # 2. Decision: Recreate the instance # 3. Apply fix hcptf my-org my-workspace runs create \ -message="Recreate deleted EC2 instance"
Scenario 2: Route53 Record IP Changed
# 1. View drift hcptf my-org my-workspace assessments # Shows: aws_route53_record.app - Action: update # records: ["1.2.3.4"] -> ["5.6.7.8"] # 2. Get VCS info to see the code RUN_ID=$(hcptf my-org my-workspace -output=json | jq -r '.CurrentRunID') hcptf my-org my-workspace runs $RUN_ID configversion # Shows: CommitURL to review the configuration # 3. Decision: Is new IP correct? # - If YES: Update variable or code # - If NO: Run apply to revert # 4a. If updating variable: hcptf variable update -org=my-org -workspace=my-workspace \ -key=app_ip -value="5.6.7.8" # 4b. If reverting infrastructure: hcptf my-org my-workspace runs create \ -message="Revert Route53 record to configured IP"
Scenario 3: Tags Added by Automation
# 1. View drift
hcptf my-org my-workspace assessments
# Shows: multiple resources - Action: update
# tags: {} -> {"Environment": "prod", "Owner": "team-a"}
# 2. Get code location
RUN_ID=$(hcptf my-org my-workspace -output=json | jq -r '.CurrentRunID')
hcptf my-org my-workspace runs $RUN_ID configversion
# Shows: RepoIdentifier, Branch, CommitURL
# 3. Decision: Tags are correct, update code to include them
# 4. Clone repo, checkout branch, update Terraform files with tags
# 5. Commit and push (triggers auto-run if VCS-backed)
Scenario 4: Certificate Expiring (Check Failure)
# 1. View checks hcptf my-org my-workspace assessments # Shows: TERRAFORM CHECK RESULTS # tls_self_signed_cert.app - FAIL # • Certificate will expire in less than 4 hours # 2. Decision: Need to regenerate certificate # 3. Update code to generate new cert or increase validity period # 4. Apply changes hcptf my-org my-workspace runs create \ -message="Regenerate expiring certificate"
Tips and Best Practices
- •Regular monitoring: Check drift regularly using Explorer queries
- •Investigate before fixing: Always understand WHY drift occurred
- •Document decisions: Include clear messages when creating runs
- •Use continuous validation: Enable Terraform checks to catch issues early
- •Automate remediation: For known drift patterns, consider automated fixes
- •Review permissions: Drift often indicates someone has console access they shouldn't
- •Update documentation: If updating code to match infrastructure, document why
Troubleshooting
No assessment results found:
- •Health assessments may not be enabled in workspace settings
- •Feature requires HCP Terraform Plus or Enterprise
- •Enable via UI or
hcptf workspace update
401 Unauthorized on assessment results:
- •Check your authentication token is valid
- •Ensure you have at least read access to the workspace
Drift shows but no resources listed:
- •May be computed attribute changes (not shown in detail)
- •Check the log output URL for full details
Apply doesn't fix drift:
- •Drift may be due to external automation re-applying changes
- •Consider updating code to match the pattern
- •Investigate what's causing the external changes
Related Commands
- •
hcptf explorer query- Query workspaces and resources - •
hcptf assessmentresult list- View drift and check details - •
hcptf workspace read- Get workspace details - •
hcptf run create- Create a run to fix drift - •
hcptf run show- Monitor run progress - •
hcptf variable list/update- Manage workspace variables - •
hcptf configversion read- Get VCS commit information
Agent Considerations
When building agents that handle drift:
- •Assess severity: Check if drift is critical or expected
- •Get approval: Don't auto-fix drift without user review
- •Provide context: Explain what changed and why
- •Link to code: Use VCS URLs to show relevant code
- •Suggest actions: Present options with pros/cons
- •Verify resolution: Confirm drift is cleared after action