Helm & Kubernetes Deployment (Research-Driven)
Philosophy
This skill does NOT dump logs into context. Instead, it guides you to:
- •Research the current Helm/K8s state efficiently
- •Extract only relevant log lines (not full logs)
- •Diagnose issues with targeted commands
- •Preserve context by summarising rather than copying
CRITICAL: Context-Efficient Log Analysis
NEVER dump full logs into context. Instead:
bash
# ✅ GOOD: Get last 20 lines with errors only kubectl logs <pod> --tail=20 2>&1 | grep -i "error\|fail\|exception" # ✅ GOOD: Get events (more useful than logs for debugging) kubectl get events --sort-by='.lastTimestamp' | tail -20 # ✅ GOOD: Check pod status first (often enough) kubectl get pods -o wide # ❌ BAD: Full log dump (burns context) kubectl logs <pod>
Pre-Implementation Research Protocol
Step 1: Verify Cluster State
ALWAYS run this first (small output, high signal):
bash
# Quick cluster health check kubectl cluster-info 2>&1 | head -5 # Check namespace pods status kubectl get pods -n <namespace> -o wide # Recent events (usually reveals issues) kubectl get events --sort-by='.lastTimestamp' -n <namespace> | tail -15
Step 2: Helm Chart Validation (Before Deploy)
bash
# Lint chart helm lint charts/<chart-name> # Dry-run template rendering helm template charts/<chart-name> --debug 2>&1 | head -100 # Validate manifests helm template charts/<chart-name> | kubectl apply --dry-run=client -f -
Step 3: Targeted Debugging (Context-Efficient)
For pod issues, use this escalation:
- •
Status check (no logs needed):
bashkubectl describe pod <pod> | grep -A 20 "Events:"
- •
Recent logs only:
bashkubectl logs <pod> --tail=30 --since=5m
- •
Error extraction:
bashkubectl logs <pod> 2>&1 | grep -i "error\|exception\|fatal" | tail -20
- •
Container-specific (for multi-container pods):
bashkubectl logs <pod> -c <container> --tail=20
Floe-Runtime Chart Structure
code
charts/ ├── floe-runtime/ # Umbrella chart ├── floe-dagster/ # Dagster webserver/daemon ├── floe-cube/ # Cube semantic layer └── floe-infrastructure/ # PostgreSQL, MinIO, Polaris
Common Debugging Patterns
| Symptom | First Command | Not Full Logs |
|---|---|---|
| Pod CrashLoopBackOff | kubectl describe pod <x> | grep -A10 Events | Don't dump logs |
| Pod Pending | kubectl describe pod <x> | grep -A5 Conditions | Check resources |
| ImagePullBackOff | kubectl describe pod <x> | grep -A3 Warning | Check image name |
| Service not reachable | kubectl get endpoints <svc> | Check selectors |
| Helm install fails | helm install --debug --dry-run 2>&1 | tail -50 | Don't dump all |
Context Injection (For Subagent Delegation)
When spawning the docker-log-analyser agent:
markdown
Analyse logs for [pod-name] focusing on: - Startup failures - Connection errors to [service] - Specific error: [paste only the error line, not full log] Return ONLY: 1. Root cause (1-2 sentences) 2. Suggested fix 3. Commands to verify fix
Quick Reference: Common Research Queries
WebSearch patterns (use when unfamiliar):
- •"Helm [chart-name] values.yaml reference 2025"
- •"Kubernetes [error-message] troubleshooting"
- •"Dagster Helm chart configuration 2025"
Integration with Floe Skills
| When working on... | Also consider... |
|---|---|
| Dagster deployment | dagster-skill (for asset config) |
| Cube deployment | cube-skill (for API endpoints) |
| Polaris in K8s | polaris-skill (for catalog config) |
Summary: Context Preservation Rules
- •Never dump full logs — extract error lines only
- •Use
kubectl describebeforekubectl logs - •Use
--tail=Non all log commands - •Delegate to docker-log-analyser agent for deep analysis
- •Summarise findings rather than pasting output