Grafana Dashboard Validation Skill
Pre-deployment validation for Grafana dashboards to catch errors before they reach production.
When to Use This Skill
Trigger Patterns:
- •User edits files in
monitoring/grafana/dashboards/ - •User mentions "deploy dashboard" or "sync grafana"
- •User creates/modifies
.jsonfiles with Grafana panel structure - •Before any Grafana deployment
- •When troubleshooting dashboard issues
Use Proactively When:
- •You see dashboard JSON being edited
- •You're about to sync dashboards to production
- •User reports "dashboard not working" issues
- •Creating new dashboards from scratch
Core Validation Workflow
1. Quick Check (During Development)
bash
# After any dashboard edit python scripts/validate_dashboard.py --quick path/to/dashboard.json
2. Full Validation (Before Deploy)
bash
# Validate all dashboards against live Prometheus python scripts/validate_dashboard.py --all # Single dashboard with full checks python scripts/validate_dashboard.py path/to/dashboard.json
3. CI/CD Validation
bash
# Quick validation with JSON output python scripts/validate_dashboard.py --quick --json --all # Parse results python scripts/validate_dashboard.py --quick --json --all | jq '.[] | select(.success==false)'
Validation Checks
Syntax Level (Always Run)
- •✅ Valid JSON structure
- •✅ Required fields (title, panels)
- •✅ Balanced parentheses/brackets in queries
- •✅ Panel type validity
- •✅ Datasource UID presence
Runtime Level (Full Mode Only)
- •✅ PromQL queries execute successfully
- •✅ Metrics exist in Prometheus
- •✅ Queries return data (warning if empty)
- •✅ Variable queries are valid
- •✅ SQL syntax is valid
What's NOT Validated
- •❌ Visual appearance
- •❌ Panel layout/positioning
- •❌ User permissions
- •❌ Plugin availability
Implementation Pattern
For New Projects
- •Copy validation script:
bash
# Copy from this skill to project cp ~/.claude/skills/grafana-dashboard-validation/validate_dashboard.py scripts/ # Or download latest version curl -o scripts/validate_dashboard.py \ https://raw.githubusercontent.com/YOUR_ORG/grafana-tools/main/validate_dashboard.py
- •Create sync script with validation:
bash
#!/bin/bash
# scripts/sync_grafana.sh
# Always validate first
python scripts/validate_dashboard.py --all || {
echo "❌ Validation failed. Fix errors or use --force to override"
exit 1
}
# Then sync
rsync -av monitoring/grafana/ user@server:/path/to/grafana/
- •Add pre-commit hook:
bash
# .git/hooks/pre-commit if git diff --cached --name-only | grep -q "dashboards.*\.json"; then python scripts/validate_dashboard.py --quick --all || exit 1 fi
For Existing Projects
Check if validation exists:
bash
# Look for validator ls -la scripts/validate_dashboard.py 2>/dev/null || echo "No validator found" # If missing, add it if [ ! -f scripts/validate_dashboard.py ]; then echo "📝 Adding dashboard validator to project..." cp ~/.claude/skills/grafana-dashboard-validation/validate_dashboard.py scripts/ fi
Common Issues and Fixes
Issue: "Metric not found"
python
# Check if metric exists
curl -s http://prometheus:9090/api/v1/label/__name__/values | jq '.data[] | select(contains("your_metric"))'
# Common causes:
# 1. Metric not yet created (deploy app first)
# 2. Wrong metric name (check spelling)
# 3. Prometheus not scraping (check targets)
Issue: "Query returns no data"
python
# Test query directly curl -g 'http://prometheus:9090/api/v1/query?query=your_promql_here' # Common causes: # 1. No data in time range (normal for new deploy) # 2. Wrong label filters # 3. Missing service
Issue: "Invalid PromQL"
python
# Validate syntax promtool check query "your_promql_here" # Common issues: # 1. Unbalanced parentheses # 2. Invalid function names # 3. Wrong aggregation syntax
Error Severity Guide
Must Fix (Errors):
- •Invalid JSON - Dashboard won't load
- •Invalid PromQL syntax - Panel will error
- •Missing datasource UID - Panel can't query
Should Fix (Warnings):
- •Metric doesn't exist - Panel will be empty
- •Query returns no data - Might be normal
- •Hardcoded dates in SQL - Will become stale
Informational:
- •Template variables - Can't fully validate
- •Hidden queries - Might be intentional
- •Non-standard UIDs - Still works
Integration Examples
GitHub Actions
yaml
name: Validate Dashboards
on:
pull_request:
paths:
- 'monitoring/grafana/dashboards/**'
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
- run: pip install requests
- run: python scripts/validate_dashboard.py --quick --json --all
GitLab CI
yaml
validate-dashboards:
stage: test
script:
- python scripts/validate_dashboard.py --quick --all
only:
changes:
- monitoring/grafana/dashboards/**
Pre-push Enforcement
bash
# .git/hooks/pre-push
#!/bin/bash
if git diff --name-only origin/main HEAD | grep -q "dashboards.*\.json"; then
echo "Validating dashboards before push..."
python scripts/validate_dashboard.py --all || {
echo "Push blocked: Dashboard validation failed"
exit 1
}
fi
Proactive Patterns for Claude
When User Edits Dashboard
python
# After editing any dashboard.json
print("I'll validate that dashboard to catch any issues...")
subprocess.run(["python", "scripts/validate_dashboard.py", "--quick", "path/to/dashboard.json"])
Before Deployment
python
# Always run before sync
print("Let me validate all dashboards before deploying...")
result = subprocess.run(["python", "scripts/validate_dashboard.py", "--all"])
if result.returncode != 0:
print("Found issues - let me help fix them...")
When Creating New Dashboard
python
# After generating dashboard JSON
with open("new_dashboard.json", "w") as f:
json.dump(dashboard_config, f, indent=2)
# Immediately validate
print("Validating the generated dashboard...")
subprocess.run(["python", "scripts/validate_dashboard.py", "--quick", "new_dashboard.json"])
Performance Considerations
- •Quick mode: <1 second (no network)
- •Full mode: 2-10 seconds (depends on query count)
- •All dashboards: Multiply by dashboard count
For large deployments (>50 dashboards), use quick mode in CI and full mode manually.
Skill Dependencies
Required Python packages:
- •
requests(for Prometheus API) - •
json(standard library) - •
pathlib(standard library) - •
argparse(standard library)
No external tools required for quick mode.
Summary
This skill provides fail-fast feedback for Grafana dashboards:
- •Catches 80% of issues before deployment
- •Two modes for different scenarios
- •CI/CD ready with JSON output
- •Proactive validation during development
Always validate before deploying. The 30 seconds spent validating saves hours of debugging broken dashboards in production.