Purpose
Enforce Helm chart quality and security standards across the helm-charts/ directory through automated checks.
What it checks (13 checks):
- Image Tags (no latest, mutable tags) - HIGH
- Security Context (runAsNonRoot, no privileged) - HIGH
- Resource Limits (requests, limits, memory) - HIGH
- RBAC Wildcards (no * permissions) - HIGH
- Health Probes (liveness, readiness) - HIGH
- Helm Lint (official helm validation) - HIGH
- Chart Metadata (apiVersion, version, maintainers) - MEDIUM
- Chart Structure (README, NOTES.txt, _helpers.tpl) - MEDIUM
- Dependencies (pinned versions, Chart.lock) - MEDIUM
- Deprecated APIs (no v1beta1, use stable APIs) - MEDIUM
- Argo Rollouts (strategy, analysis, steps) - MEDIUM
- Ingress TLS (certificates, annotations) - MEDIUM
- GPU Resources (nvidia.com/gpu, tolerations) - LOW
Running Checks
Full audit (all checks):
node .claude/skills/helm-charts-audit/scripts/run_all_checks.mjs
Generate report (all checks + markdown report):
node .claude/skills/helm-charts-audit/scripts/generate_report.mjs
Report saved to: reports/YYYY-MM-DD/helm-charts-audit.md
Individual checks:
node .claude/skills/helm-charts-audit/scripts/check_image_tags.mjs
node .claude/skills/helm-charts-audit/scripts/check_security_context.mjs
node .claude/skills/helm-charts-audit/scripts/check_resource_limits.mjs
node .claude/skills/helm-charts-audit/scripts/check_rbac_wildcards.mjs
node .claude/skills/helm-charts-audit/scripts/check_health_probes.mjs
node .claude/skills/helm-charts-audit/scripts/check_helm_lint.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_metadata.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_structure.mjs
node .claude/skills/helm-charts-audit/scripts/check_dependencies.mjs
node .claude/skills/helm-charts-audit/scripts/check_deprecated_apis.mjs
node .claude/skills/helm-charts-audit/scripts/check_argo_rollouts.mjs
node .claude/skills/helm-charts-audit/scripts/check_ingress_tls.mjs
node .claude/skills/helm-charts-audit/scripts/check_gpu_resources.mjs
Quality Rules
1. Image Tags (HIGH)
RULE: Never use mutable tags. latest tag = unpredictable deployments + rollback failures.
Violations:
- image: nginx:latest - mutable, changes without notice
- image: nginx - defaults to :latest
- tag: "" - empty tag in values.yaml
- tag: head, tag: canary, tag: dev - mutable branch tags
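Fix: Pin a specific version tag such as v1.2.3 or 1.21.0, or reference the image by SHA digest (sha256:abc123). Only a digest is guaranteed immutable; a version tag can still be re-pushed.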
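The violations above can be detected from values alone. The following is a minimal sketch, not the actual check_image_tags.mjs implementation: the function names, the verdict strings, and the list of mutable tag names are assumptions made for illustration.

```javascript
// Hypothetical helpers; names and verdict strings are illustrative only.
const MUTABLE_TAGS = new Set(["latest", "head", "canary", "dev"]);

function classifyImageRef(ref) {
  // A digest reference (name@sha256:...) is pinned regardless of any tag.
  if (ref.includes("@sha256:")) return "ok";
  // Look only at the last path segment so a registry port is not read as a tag.
  const lastSegment = ref.slice(ref.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  if (colon === -1) return "missing-tag"; // resolves to :latest
  const tag = lastSegment.slice(colon + 1).toLowerCase();
  return MUTABLE_TAGS.has(tag) ? "mutable-tag" : "ok";
}

function findImageTagViolations(yamlText) {
  const violations = [];
  yamlText.split("\n").forEach((line, index) => {
    const image = line.match(/^\s*-?\s*image:\s*["']?([^"'\s#]+)/);
    // Templated references such as {{ .Values.image }} are skipped here.
    if (image && !image[1].includes("{{")) {
      const verdict = classifyImageRef(image[1]);
      if (verdict !== "ok") violations.push({ line: index + 1, verdict });
    }
    const tag = line.match(/^\s*tag:\s*["']?([^"'\s#]*)/);
    if (tag) {
      if (tag[1] === "") {
        violations.push({ line: index + 1, verdict: "empty-tag" });
      } else if (MUTABLE_TAGS.has(tag[1].toLowerCase())) {
        violations.push({ line: index + 1, verdict: "mutable-tag" });
      }
    }
  });
  return violations;
}
```

Calling findImageTagViolations on the text of a values.yaml or rendered template returns one entry per offending line, which a report generator could then group by chart.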
2. Security Context (HIGH)
RULE: Containers must run with minimal privileges. Privileged containers = cluster takeover risk.
Violations:
- privileged: true - full host access, container escape trivial
- runAsNonRoot: false - runs as root user UID 0
- runAsUser: 0 - explicitly root
- allowPrivilegeEscalation: true - can gain more privileges
- hostNetwork: true - shares host network namespace
- hostPID: true - can see/kill host processes
- hostIPC: true - can access host shared memory
- readOnlyRootFilesystem: false - malware can write anywhere
- capabilities.add: [SYS_ADMIN] - near-root level access
- capabilities.add: [ALL] - equivalent to privileged
Fix: Add proper securityContext with runAsNonRoot: true, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities.drop: [ALL].
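One way to express these rules is a table of line-level patterns. This is a sketch under assumptions: the pattern table and the finding ids are hypothetical, and the real check_security_context.mjs may work differently. Only inline capability lists (add: [...]) are handled; block-style lists are left out for brevity.

```javascript
// Hypothetical pattern table; ids are illustrative only.
const RISKY_SETTINGS = [
  { id: "privileged", pattern: /^\s*privileged:\s*true\b/ },
  { id: "runAsNonRoot-false", pattern: /^\s*runAsNonRoot:\s*false\b/ },
  { id: "runAsUser-root", pattern: /^\s*runAsUser:\s*0\s*(?:#.*)?$/ },
  { id: "allowPrivilegeEscalation", pattern: /^\s*allowPrivilegeEscalation:\s*true\b/ },
  { id: "host-namespace", pattern: /^\s*host(?:Network|PID|IPC):\s*true\b/ },
  { id: "writable-root-filesystem", pattern: /^\s*readOnlyRootFilesystem:\s*false\b/ },
  { id: "dangerous-capability", pattern: /^\s*add:\s*\[[^\]]*\b(?:SYS_ADMIN|ALL)\b[^\]]*\]/ },
];

function findSecurityContextViolations(text) {
  const findings = [];
  // Scan line by line so each finding carries a 1-based line number.
  text.split("\n").forEach((line, index) => {
    for (const { id, pattern } of RISKY_SETTINGS) {
      if (pattern.test(line)) findings.push({ line: index + 1, id });
    }
  });
  return findings;
}
```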
3. Resource Limits (HIGH)
RULE: All containers must have resource requests and limits. No limits = node OOM + noisy neighbor issues.
Violations:
- resources: {} - empty resources block
- Missing requests.cpu - scheduler can't make decisions
- Missing requests.memory - OOM killer may terminate unexpectedly
- Missing limits.memory - container can consume all node memory
- requests > limits - invalid configuration
Fix: Define resources.requests.cpu, resources.requests.memory, and resources.limits.memory. Note: CPU limits are often intentionally omitted to avoid CPU throttling.
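Detecting requests > limits requires comparing Kubernetes quantities such as 500m or 256Mi as numbers rather than strings. A minimal sketch follows; parseQuantity and requestExceedsLimit are hypothetical names, and only the most common suffixes are covered.

```javascript
// Multipliers for common Kubernetes quantity suffixes (binary, decimal, milli).
const QUANTITY_SUFFIXES = {
  Ki: 2 ** 10, Mi: 2 ** 20, Gi: 2 ** 30, Ti: 2 ** 40,
  k: 1e3, M: 1e6, G: 1e9, T: 1e12,
  m: 1e-3,
};

function parseQuantity(quantity) {
  const match = String(quantity).trim().match(/^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T|m)?$/);
  if (!match) throw new Error(`Unsupported quantity: ${quantity}`);
  const multiplier = match[2] ? QUANTITY_SUFFIXES[match[2]] : 1;
  return Number(match[1]) * multiplier;
}

function requestExceedsLimit(request, limit) {
  // Both values are normalised to base units (bytes or cores) before comparing.
  return parseQuantity(request) > parseQuantity(limit);
}
```

For example, requestExceedsLimit("512Mi", "256Mi") is true, while requestExceedsLimit("500m", "1") is false because 500m is half a core.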
4. RBAC Wildcards (HIGH)
RULE: Follow least-privilege principle. Wildcard permissions = privilege escalation path.
Violations:
- verbs: ["*"] - grants all actions
- resources: ["*"] - access to all resource types
- apiGroups: ["*"] - access across all API groups
- roleRef.name: cluster-admin - full cluster access
- verbs: [impersonate] - can act as other users
- verbs: [escalate, bind] - can grant additional privileges
- Access to secrets resource - can read all secrets
Fix: Use explicit verbs like [get, list, watch], explicit resources like [pods, services], avoid cluster-admin bindings.
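The wildcard cases can be found in both inline lists (verbs: ["*"]) and block lists (a "*" item under the key). The sketch below is an illustration under assumptions, not the actual check_rbac_wildcards.mjs; findRbacWildcards is a hypothetical name, and cluster-admin bindings and sensitive verbs are not covered.

```javascript
function findRbacWildcards(text) {
  const findings = [];
  let currentKey = null; // most recent verbs/resources/apiGroups key awaiting list items
  text.split("\n").forEach((line, index) => {
    const key = line.match(/^\s*-?\s*(verbs|resources|apiGroups):\s*(.*)$/);
    if (key) {
      currentKey = key[1];
      // Inline flow list on the same line, e.g. verbs: ["*"]
      if (/["']\*["']/.test(key[2])) findings.push({ line: index + 1, field: key[1] });
      return;
    }
    // Block list item holding a quoted wildcard, e.g. - "*"
    if (currentKey && /^\s*-\s*["']\*["']\s*$/.test(line)) {
      findings.push({ line: index + 1, field: currentKey });
      return;
    }
    // Any line that is not a list item ends the current list.
    if (!/^\s*-\s/.test(line)) currentKey = null;
  });
  return findings;
}
```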
5. Health Probes (HIGH)
RULE: All workloads must have health probes. No probes = stuck containers not restarted + traffic to unready pods.
Violations:
- Deployment without livenessProbe - stuck containers won't restart
- Deployment without readinessProbe - traffic sent to unready pods
- initialDelaySeconds: 0 - probes start immediately, false failures
- timeoutSeconds: 1 - too short, may cause false failures
- successThreshold > 1 on livenessProbe - should always be 1
- failureThreshold > 10 - delays detecting actual failures
Fix: Add livenessProbe and readinessProbe with reasonable initialDelaySeconds (10-30s), periodSeconds (10s), timeoutSeconds (5s).
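A rough sketch of the missing-probe and failureThreshold rules is shown below. The function name, the finding strings, and the set of workload kinds (Deployment, StatefulSet, DaemonSet) are assumptions; the source only names Deployment explicitly. The sketch treats the manifest as a whole and does not verify probes per container.

```javascript
// Assumed set of workload kinds that should declare probes.
const WORKLOAD_KINDS = /^kind:[ \t]*(?:Deployment|StatefulSet|DaemonSet)[ \t]*$/m;

function findProbeViolations(manifest) {
  const findings = [];
  if (!WORKLOAD_KINDS.test(manifest)) return findings;
  for (const probe of ["livenessProbe", "readinessProbe"]) {
    const declared = new RegExp(`^[ \\t]*(?:-[ \\t]*)?${probe}:`, "m").test(manifest);
    if (!declared) findings.push(`missing-${probe}`);
  }
  // Flag every failureThreshold above 10, wherever it appears.
  for (const match of manifest.matchAll(/^[ \t]*failureThreshold:[ \t]*(\d+)/gm)) {
    if (Number(match[1]) > 10) findings.push("failureThreshold-above-10");
  }
  return findings;
}
```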
6. Helm Lint (HIGH)
RULE: Charts must pass official helm lint validation. Lint failures = deployment failures.
Violations:
- Template syntax errors
- Missing required fields in Chart.yaml
- Invalid YAML structure
- Broken template references
Fix: Run helm lint <chart-path> and fix reported issues.
7. Chart Metadata (MEDIUM)
RULE: Chart.yaml must have complete metadata. Missing metadata = maintenance nightmare.
Violations:
- apiVersion: v1 - Helm 2 format, upgrade to v2
- Missing or invalid version - must be SemVer
- Missing appVersion - hard to track what's deployed
- Missing description - unclear what chart does
- Missing maintainers - no ownership
- name doesn't match directory name - confusing
Fix: Use apiVersion: v2, SemVer version, add description and maintainers with email.
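The metadata rules can be sketched as a regex scan over top-level keys in Chart.yaml. The helper names (topLevelValue, checkChartMetadata) and the finding strings are hypothetical, and the SemVer pattern is a simplified approximation rather than the full SemVer grammar.

```javascript
// Simplified SemVer pattern: MAJOR.MINOR.PATCH with optional pre-release and build parts.
const CHART_SEMVER = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

function topLevelValue(chartYaml, key) {
  // Only keys that start in column 0 are considered top-level.
  const match = chartYaml.match(new RegExp(`^${key}:[ \\t]*(.*)$`, "m"));
  if (!match) return null;
  // Drop trailing comments and surrounding quotes.
  return match[1].replace(/\s+#.*$/, "").replace(/^["']|["']$/g, "").trim();
}

function checkChartMetadata(chartYaml, directoryName) {
  const findings = [];
  if (topLevelValue(chartYaml, "apiVersion") !== "v2") findings.push("apiVersion-not-v2");
  const version = topLevelValue(chartYaml, "version");
  if (!version || !CHART_SEMVER.test(version)) findings.push("version-not-semver");
  for (const key of ["appVersion", "description", "maintainers"]) {
    if (topLevelValue(chartYaml, key) === null) findings.push(`missing-${key}`);
  }
  if (topLevelValue(chartYaml, "name") !== directoryName) findings.push("name-directory-mismatch");
  return findings;
}
```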
8. Chart Structure (MEDIUM)
RULE: Follow standard Helm chart structure. Non-standard = user confusion + missing features.
Violations:
- Missing README.md - no documentation
- Missing templates/NOTES.txt - no post-install instructions
- Missing templates/_helpers.tpl - no template helpers
- Missing .helmignore - unnecessary files in package
- Missing values.schema.json - no values validation
- Empty templates/ directory
Fix: Create missing files following Helm chart best practices.
9. Dependencies (MEDIUM)
RULE: Pin dependency versions. Floating versions = non-reproducible builds.
Violations:
- No version on dependency - unpinned
- version: "*" or version: "^1.0" - floating version
- Missing Chart.lock - dependency versions not locked
- repository: file:// - local reference, breaks when published
- repository: http:// - insecure, use HTTPS
- Deprecated repository URLs (charts.helm.sh/stable)
Fix: Pin exact versions, run helm dependency update to generate Chart.lock.
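As an illustration of the pinning rules, the sketch below classifies a single dependency entry. It assumes the entry has already been extracted into an object with version and repository fields; isPinnedVersion and checkDependency are hypothetical names, and the Chart.lock and deprecated-URL checks are not included.

```javascript
function isPinnedVersion(version) {
  // Exact MAJOR.MINOR.PATCH only; ranges such as ^1.0, ~1.2, 1.x or * are rejected.
  return /^v?\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/.test(String(version).trim());
}

function checkDependency(dependency) {
  const findings = [];
  if (!dependency.version) {
    findings.push("unpinned");
  } else if (!isPinnedVersion(dependency.version)) {
    findings.push("floating-version");
  }
  const repository = dependency.repository || "";
  if (repository.startsWith("file://")) findings.push("local-repository");
  if (repository.startsWith("http://")) findings.push("insecure-repository");
  return findings;
}
```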
10. Deprecated APIs (MEDIUM)
RULE: Use stable Kubernetes APIs. Deprecated APIs = upgrade failures.
Violations:
- extensions/v1beta1 - workload resources removed in K8s 1.16, Ingress removed in K8s 1.22
- apps/v1beta1, apps/v1beta2 - removed in K8s 1.16
- networking.k8s.io/v1beta1 Ingress - removed in K8s 1.22
- batch/v1beta1 CronJob - removed in K8s 1.25
- policy/v1beta1 PodSecurityPolicy - removed in K8s 1.25
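Fix: Update to stable APIs: apps/v1, networking.k8s.io/v1, batch/v1. PodSecurityPolicy has no replacement API and is superseded by Pod Security Admission. Run kubectl convert (a separately installed plugin) if needed.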
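A lookup table keyed on apiVersion and kind is one simple way to report both the removal release and the replacement. The table below is a partial, illustrative subset and findRemovedApi is a hypothetical name; it is not the data used by check_deprecated_apis.mjs.

```javascript
// Partial table of removed APIs; replacement is null where no successor API exists.
const REMOVED_APIS = [
  { apiVersion: "extensions/v1beta1", kind: "Deployment", removedIn: "1.16", replacement: "apps/v1" },
  { apiVersion: "extensions/v1beta1", kind: "Ingress", removedIn: "1.22", replacement: "networking.k8s.io/v1" },
  { apiVersion: "apps/v1beta1", kind: "Deployment", removedIn: "1.16", replacement: "apps/v1" },
  { apiVersion: "apps/v1beta2", kind: "Deployment", removedIn: "1.16", replacement: "apps/v1" },
  { apiVersion: "networking.k8s.io/v1beta1", kind: "Ingress", removedIn: "1.22", replacement: "networking.k8s.io/v1" },
  { apiVersion: "batch/v1beta1", kind: "CronJob", removedIn: "1.25", replacement: "batch/v1" },
  { apiVersion: "policy/v1beta1", kind: "PodSecurityPolicy", removedIn: "1.25", replacement: null },
];

function findRemovedApi(manifest) {
  // Read the top-level apiVersion and kind of a single YAML document.
  const apiVersion = manifest.match(/^apiVersion:[ \t]*["']?([^"'\s#]+)/m);
  const kind = manifest.match(/^kind:[ \t]*["']?([^"'\s#]+)/m);
  if (!apiVersion || !kind) return null;
  const entry = REMOVED_APIS.find(
    (candidate) => candidate.apiVersion === apiVersion[1] && candidate.kind === kind[1]
  );
  return entry || null;
}
```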
11. Argo Rollouts (MEDIUM)
RULE: Rollouts must have valid strategy configuration. Invalid config = failed deployments.
Violations:
- Rollout without strategy - no deployment strategy
- Canary without steps - no gradual rollout
- Canary without analysis - no automated validation
- BlueGreen without activeService - no active service defined
- BlueGreen without previewService - can't preview before promotion
- Missing revisionHistoryLimit - old ReplicaSets accumulate
- Missing progressDeadlineSeconds - stuck rollouts don't timeout
Fix: Configure proper canary steps with analysis, or blueGreen with activeService/previewService.
12. Ingress TLS (MEDIUM)
RULE: Ingress must have TLS configuration. No TLS = unencrypted traffic.
Violations:
- Ingress with hosts but no TLS - traffic unencrypted
- TLS without secretName - certificate source unclear
- No ingressClassName - may use wrong controller
- Missing cert-manager annotations - no automated certificates
- Deprecated kubernetes.io/ingress.class annotation
- No SSL redirect annotation - HTTP doesn't redirect to HTTPS
Fix: Add TLS section with secretName, use cert-manager.io/cluster-issuer annotation for automated certs.
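Several of the Ingress rules reduce to presence checks on a single manifest. The sketch below is a simplified illustration with hypothetical names and finding strings; it does not cover the cert-manager or SSL redirect annotations, and it assumes tls: is written as a block key on its own line.

```javascript
function checkIngressTls(manifest) {
  const findings = [];
  if (!/^kind:[ \t]*Ingress[ \t]*$/m.test(manifest)) return findings;
  if (!/^[ \t]*tls:[ \t]*$/m.test(manifest)) {
    findings.push("missing-tls");
  } else if (!/^[ \t]*(?:-[ \t]*)?secretName:[ \t]*\S+/m.test(manifest)) {
    findings.push("tls-without-secretName");
  }
  if (!/^[ \t]*ingressClassName:[ \t]*\S+/m.test(manifest)) findings.push("missing-ingressClassName");
  if (/kubernetes\.io\/ingress\.class/.test(manifest)) findings.push("deprecated-ingress-class-annotation");
  return findings;
}
```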
13. GPU Resources (LOW)
RULE: GPU workloads need proper configuration. Missing config = scheduling failures.
Violations:
- GPU limits without matching requests - should be equal
- No GPU toleration - won't schedule on GPU nodes
- No GPU nodeSelector/affinity - relies only on resource availability
- No runtimeClassName - may need nvidia runtime
Fix: Set nvidia.com/gpu in both requests and limits (equal values), add GPU tolerations and nodeSelector.
Detection Philosophy
This skill uses VALUE-BASED detection:
- Detects issues by actual values and patterns, not by variable/field names
- Future-proof: new charts with issues are automatically detected
- No need to update scripts when new charts are added
Parsing Strategy
- Chart.yaml, values.yaml: YAML content parsed via regex patterns
- templates/*.yaml: Regex-based parsing (Go template syntax breaks YAML parsers)
- Multi-document YAML: Handles --- separators
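Splitting a multi-document template on its --- separators lets each resource be checked on its own. A minimal sketch, assuming separators sit on their own line; splitYamlDocuments is a hypothetical name, not a function from the scripts.

```javascript
function splitYamlDocuments(text) {
  return text
    // A separator is a line holding only --- (optionally followed by a comment).
    .split(/^---[ \t]*(?:#.*)?$/m)
    .map((documentText) => documentText.trim())
    // Drop empty chunks produced by leading or trailing separators.
    .filter((documentText) => documentText.length > 0);
}
```

Each returned string can then be passed to a per-resource check, so a Service and a Deployment in the same template file are evaluated independently.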
Safety
- Read-only operation (except report generation)
- No Helm releases modified
- No cluster changes