AgentSkillsCN

tag-coverage-analysis

评估云资源标签的质量与覆盖率,识别未打标或标签欠佳的资源,计算标签覆盖百分比,进而提升成本分摊的准确性,为 Showback 和 Chargeback 提供更可靠的数据支持

SKILL.md
--- frontmatter
name: tag-coverage-analysis
description: Evaluates cloud resource tagging quality and coverage to identify untagged or poorly tagged resources, calculate tag coverage percentages, and improve cost allocation accuracy for showback and chargeback
author: CloudZero <support@cloudzero.com>
version: 1.0.0
license: Apache-2.0

Tag Coverage Analysis

Purpose

This skill evaluates the quality and completeness of cloud resource tagging to improve cost allocation, enable accurate showback/chargeback, and ensure governance compliance.

When to Use

  • "What's our tag coverage?"
  • "How many resources are untagged?"
  • "Show me untagged costs"
  • "Analyze our tagging quality"
  • "Which accounts have poor tagging?"
  • Setting up showback/chargeback
  • Improving cost allocation
  • Governance and compliance reviews
  • Identifying unattributed costs
  • Keywords: tag, tagging, untagged, coverage, attribution, showback, chargeback, governance

Prerequisites

This skill builds on the understand-cloudzero-organization skill.

Before applying this procedure:

  • If you haven't already in this session, load the understand-cloudzero-organization skill and follow its instructions
  • Reference the cached organization context (don't reload unnecessarily)

How This Skill Works

Step 1: Discover Available Tags

Identify what tags exist in the organization:

code
get_available_dimensions(filter="Tag")

This returns all tags in use. Common tags include:

  • CZ:Tag:Environment
  • CZ:Tag:Team
  • CZ:Tag:Application
  • CZ:Tag:Owner
  • CZ:Tag:CostCenter
  • CZ:Tag:Project

Step 2: Calculate Overall Tag Coverage

For each critical tag, calculate coverage:

code
# Get total costs
get_cost_data(cost_type="real_cost")

# Get costs WITH the tag (any value)
get_cost_data(
    group_by=["CZ:Tag:Environment"],
    cost_type="real_cost"
)

# Get costs WITHOUT the tag (calculate as difference)

Coverage Formula:

code
Tag Coverage % = (Cost of Tagged Resources / Total Cost) * 100
Untagged Cost = Total Cost - Cost of Tagged Resources
Untagged % = (Untagged Cost / Total Cost) * 100

Step 3: Analyze Each Critical Tag

For each important tag (Environment, Team, Application, etc.):

Tag Distribution:

code
get_cost_data(
    group_by=["CZ:Tag:Environment"],
    limit=50
)

Identify untagged costs by dimension:

code
# Untagged by Account
get_cost_data(
    group_by=["CZ:Account"],
    # Note: Can't directly filter for "no tag", so compare to total by account
)

# Untagged by Service
get_cost_data(
    group_by=["CZ:Service"],
)

# Then identify which services have low tag coverage

Step 4: Multi-Tag Analysis

Analyze coverage across multiple tags simultaneously:

code
get_cost_data(
    group_by=["CZ:Tag:Environment", "CZ:Tag:Team"],
    limit=100
)

Identify resources that have:

  • All critical tags (good)
  • Some critical tags (partial)
  • No critical tags (bad)

Step 5: Account-Level Tag Coverage

Identify which accounts have tagging issues:

code
# For each account, analyze tag coverage
get_cost_data(
    group_by=["CZ:Account", "CZ:Tag:Environment"],
    limit=100
)

Calculate coverage % per account and rank accounts by:

  • Worst tagging coverage
  • Highest untagged cost amounts
  • Most critical accounts with tagging issues

Step 6: Service-Level Tag Coverage

Identify which services are commonly untagged:

code
get_cost_data(
    group_by=["CZ:Service", "CZ:Tag:Environment"],
    limit=100
)

Some services may be harder to tag (e.g., data transfer, some AWS service fees). Identify:

  • Which services have good tag coverage
  • Which services have poor tag coverage
  • Which services are untaggable (inherently)

Step 7: Time-Based Coverage Trends

Analyze if tagging is improving or degrading:

code
# Current month
get_cost_data(
    group_by=["CZ:Tag:Environment"],
    date_range="this month"
)

# Previous month
get_cost_data(
    group_by=["CZ:Tag:Environment"],
    date_range="last month"
)

# Three months ago
get_cost_data(
    group_by=["CZ:Tag:Environment"],
    date_range="3 months ago"
)

Calculate coverage % for each period and identify trend.

Step 8: Tag Value Quality Analysis

Beyond just having tags, analyze tag value quality:

Identify problematic patterns:

  • Empty values: tag exists but value is ""
  • Inconsistent naming: "prod" vs. "production" vs. "Prod"
  • Typos: "productin"
  • Invalid values: "test123", "temp", "unknown"
  • Non-standard formats: different date formats, different naming conventions
code
get_dimension_values(dimension="CZ:Tag:Environment")

Review all values and flag:

  • Inconsistent capitalization
  • Unclear or uninformative values
  • Values that should be consolidated

Step 9: Highest-Value Untagged Resources

Prioritize tagging efforts by identifying expensive untagged resources:

code
# Get top services, then check which have poor tag coverage
get_cost_data(
    group_by=["CZ:Service"],
    limit=20
)

# For each high-cost service, analyze tag coverage
get_cost_data(
    filters={"CZ:Service": ["AmazonEC2"]},
    group_by=["CZ:Tag:Environment", "CZ:Account"],
    limit=50
)

Focus on:

  • Highest cost untagged resources (most impact)
  • Easy-to-tag resources (quick wins)
  • Services critical for showback/chargeback

Output Format

Provide comprehensive tagging analysis:

1. Executive Summary

  • Overall tag coverage: XX%
  • Total untagged costs: $X,XXX (XX% of total)
  • Number of tags in use: X
  • Tagging trend: [Improving/Degrading/Stable]
  • Priority action: [Top recommendation]

2. Tag Coverage Scorecard

Critical Tags:

TagCoverage %Tagged CostUntagged CostStatus
EnvironmentXX%$X,XXX$X,XXX⚠️/✅
TeamXX%$X,XXX$X,XXX⚠️/✅
ApplicationXX%$X,XXX$X,XXX⚠️/✅
CostCenterXX%$X,XXX$X,XXX⚠️/✅
OwnerXX%$X,XXX$X,XXX⚠️/✅

Overall Tag Health:

  • Excellent (>90%): ✅ [X tags]
  • Good (70-90%): ⚠️ [X tags]
  • Poor (<70%): ❌ [X tags]

3. Untagged Cost Breakdown

Total Untagged: $X,XXX (XX% of total cloud spend)

By Service:

ServiceTotal CostUntagged CostUntagged %Priority
Service A$X,XXX$X,XXXXX%High
Service B$X,XXX$X,XXXXX%Medium
...............

By Account:

AccountTotal CostUntagged CostUntagged %Status
Account A$X,XXX$X,XXXXX%
Account B$X,XXX$X,XXXXX%⚠️
...............

By Region:

RegionUntagged Cost% of Total Untagged
Region A$X,XXXXX%
Region B$X,XXXXX%
.........

4. Tag Distribution Analysis

For each critical tag, show value distribution:

Environment Tag Distribution:

  • production: $X,XXX (XX%)
  • staging: $X,XXX (XX%)
  • development: $X,XXX (XX%)
  • UNTAGGED: $X,XXX (XX%) ⚠️

Team Tag Distribution:

  • UNTAGGED: $X,XXX (XX%) ⚠️

5. Multi-Tag Coverage Matrix

Resources with Complete Tagging:

EnvironmentTeamApplicationCostStatus
productionTeam AApp X$X,XXX✅ Fully Tagged
stagingTeam BApp Y$X,XXX✅ Fully Tagged

Resources with Partial Tagging:

EnvironmentTeamApplicationCostMissing Tags
production(untagged)App X$X,XXXTeam
(untagged)Team AApp Y$X,XXXEnvironment

Resources with No Tags:

AccountServiceCost
Account AEC2$X,XXX
Account BRDS$X,XXX

6. Tag Quality Issues

Inconsistent Values:

  • Environment tag has: "prod", "production", "Prod", "PRODUCTION"
    • Should standardize to: "production"
    • Affects: $X,XXX across Y resources

Invalid/Unclear Values:

  • Team tag has: "test", "temp", "unknown", "tbd"
    • Should be replaced with actual team names
    • Affects: $X,XXX

Formatting Issues:

  • Date tags using inconsistent formats
  • Case sensitivity problems
  • Whitespace issues

7. Coverage Trend

Historical Coverage:

MonthEnvironment TagTeam TagOverall
3 months agoXX%XX%XX%
2 months agoXX%XX%XX%
Last monthXX%XX%XX%
This monthXX%XX%XX%

Trend Analysis:

  • Coverage is [improving/degrading/stable]
  • Rate of change: +/-X% per month
  • If current trend continues: [projection]

8. Impact on Cost Allocation

Current State:

  • Can accurately allocate: $X,XXX (XX%)
  • Cannot allocate: $X,XXX (XX%)

Business Impact:

  • Showback/chargeback accuracy: XX%
  • Teams/projects with unclear attribution: [List]
  • Budget allocation challenges: [Details]

If tagging improved to 95%:

  • Additional allocated costs: $X,XXX
  • Improved allocation accuracy: +XX%
  • Better cost visibility for [X teams]

9. Actionable Recommendations

Immediate Actions (High Priority):

  1. Tag top 10 untagged resources - Would cover $X,XXX (XX% of untagged)

    • Specific: [Account A] EC2 instances in [Region]
    • Specific: [Account B] RDS databases
  2. Fix Account [X] tagging - Worst offender with XX% untagged

    • Total impact: $X,XXX
    • Recommended: Implement tag enforcement policy
  3. Standardize Environment tag values - Consolidate X variations

    • Current: "prod", "production", "Prod"
    • Target: "production"
    • Affects: $X,XXX

Short-Term Actions (1-2 weeks):

  1. Implement tag policies in [Accounts] - Prevent new untagged resources
  2. Create tagging automation - Auto-tag resources on creation
  3. Set up tag compliance monitoring - Alert on untagged resource creation
  4. Audit service [X] - High cost, low tag coverage

Long-Term Actions (1-3 months):

  1. Establish tagging governance - Formal policies and enforcement
  2. Tag remediation project - Systematic cleanup of existing resources
  3. Integrate tagging into IaC - Terraform/CloudFormation templates
  4. Training and documentation - Educate teams on tagging standards

10. Account-Specific Action Plan

For each account with poor tagging:

Account: [Account Name/ID]

  • Total Cost: $X,XXX
  • Untagged: $X,XXX (XX%)
  • Worst Service: [Service] with $X,XXX untagged
  • Recommended Actions:
    1. [Specific action]
    2. [Specific action]
  • Owner: [If known from org context]
  • Priority: [High/Medium/Low]

11. Tag Enforcement Recommendations

Preventive Measures:

  1. AWS Organizations Tag Policies

    • Require tags: Environment, Team, CostCenter
    • Enforce on resource creation
    • Target accounts: [List]
  2. Automated Tagging

    • Tag Lambda to auto-tag resources on creation
    • Inherit tags from VPC, subnet, or other parent resources
    • Default tags for specific resource types
  3. CI/CD Integration

    • Tag validation in infrastructure pipelines
    • Reject deployments without required tags
    • Automated tag suggestions

Detective Measures:

  1. Daily tagging reports

    • Email to account owners
    • Include cost impact
    • Track coverage trends
  2. CloudZero alerts

    • Alert on new untagged resources above $X threshold
    • Weekly summaries of tagging violations

Skill-Specific Best Practices

  1. Focus on financial impact - Prioritize by cost, not just resource count
  2. Be specific - Identify exact accounts, services, resources to tag
  3. Calculate coverage properly - Some costs may be inherently untaggable
  4. Track trends - Is it getting better or worse?
  5. Provide actionable steps - Not just "tag your resources," but specific items
  6. Consider tag value quality - Having a tag with bad data isn't much better than no tag
  7. Differentiate untaggable costs - Some AWS fees can't be tagged, exclude from coverage calculations

For general cost analysis best practices, see ${CLAUDE_PLUGIN_ROOT}/references/best-practices.md

Common Tagging Challenges

Challenge 1: Inherited or Shared Resources

Some resources (data transfer, NAT gateways) serve multiple purposes.

Solution:

  • Use CloudZero's custom dimensions for allocation
  • Apply tags at infrastructure level where possible
  • Use CostFormation rules for attribution

Challenge 2: Third-Party Services

Marketplace or third-party charges may not support tags.

Solution:

  • Tag at billing account level
  • Use account structure for isolation
  • Document and exclude from coverage calculations

Challenge 3: Legacy Resources

Old resources created before tagging policies.

Solution:

  • Prioritize by cost for remediation
  • Automate where possible (tag by naming convention, etc.)
  • Set sunset dates for untagged resources

Challenge 4: Inconsistent Tag Values

Many variations of same concept.

Solution:

  • Document standard values
  • Use validation at creation
  • Bulk remediation to standardize
  • Use CloudZero to normalize variations

Advanced Techniques

Tagging Maturity Score

Create composite score:

code
Maturity Score = (
  (Coverage % × 0.4) +
  (Value Quality % × 0.3) +
  (Multi-tag Coverage % × 0.2) +
  (Enforcement Implementation % × 0.1)
)

Cost-Weighted Coverage

Give more weight to expensive resources:

code
Weighted Coverage = Σ(Cost × Coverage) / Total Cost

Tag Dependency Analysis

Some tags depend on others:

  • If "Environment:production", should have "BackupPolicy:daily"
  • If "Team:data", should have "DataClassification" tag

Attribution Gap Analysis

Calculate how much cost cannot be attributed:

code
Attribution Gap = (Untagged Cost + Partial Tag Cost) / Total Cost

Tips for Effective Analysis

  1. Quantify impact - Show dollar amounts, not just percentages
  2. Be specific - Name exact accounts, services, resources
  3. Prioritize - Focus on highest-cost untagged resources first
  4. Trend it - Show if improving or degrading over time
  5. Compare accounts - Highlight good and bad performers
  6. Provide templates - Suggest specific tag policies and automation
  7. Connect to business goals - Link tagging to showback/chargeback needs

See Also

  • understand-cloudzero-organization skill - Load organization context first
  • ${CLAUDE_PLUGIN_ROOT}/references/best-practices.md - Universal cost analysis best practices
  • ${CLAUDE_PLUGIN_ROOT}/references/cloudzero-tools-reference.md - Complete tool documentation
  • ${CLAUDE_PLUGIN_ROOT}/references/error-handling.md - Troubleshooting and common errors
  • ${CLAUDE_PLUGIN_ROOT}/references/dimensions-reference.md - Dimension types and FQDIDs
  • ${CLAUDE_PLUGIN_ROOT}/references/cost-types-reference.md - When to use each cost type