Kubernetes Troubleshoot
Debug pods, services, deployments, and networking issues in Kubernetes.
Instructions
- Identify the affected resource (pod, service, deployment)
- Get current state with `kubectl get` and `kubectl describe`
- Check logs if applicable
- Diagnose based on status/events
- Provide specific remediation steps
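The steps above can be sketched as a single read-only helper. This is a minimal illustration, not a prescribed workflow: `triage_pod` is a hypothetical name, and the `--tail=50` limit is an arbitrary assumption to keep output short.

```bash
# Sketch: gather state, events, and logs for one pod, in the order listed above.
# triage_pod is a hypothetical helper; it only reads from the cluster.
triage_pod() {
  local pod="$1"
  local ns="${2:-default}"

  echo "== state =="
  kubectl get pod "$pod" -n "$ns" -o wide

  echo "== describe (status and events) =="
  kubectl describe pod "$pod" -n "$ns"

  echo "== logs (current container) =="
  kubectl logs "$pod" -n "$ns" --tail=50

  echo "== logs (previous container, if it restarted) =="
  kubectl logs "$pod" -n "$ns" --previous --tail=50 2>/dev/null \
    || echo "no previous container logs"
}
```

Called as `triage_pod <pod> <namespace>`, it prints everything needed for the diagnosis step without changing any resource.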
Diagnostic commands
```bash
# Pod debugging
kubectl get pods -o wide
kubectl describe pod <pod>
kubectl logs <pod> [--previous] [-c container]
kubectl get events --sort-by=.lastTimestamp

# Service/networking
kubectl get svc,endpoints
kubectl describe svc <service>
kubectl get ingress

# Resource issues
kubectl top pods
kubectl describe node <node> | grep -A5 "Allocated resources"

# Debug pod (ephemeral container)
kubectl debug -it <pod> --image=busybox --target=<container>
```
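On a busy namespace the commands above can return a lot of noise. One possible way to narrow the output is with field selectors, sketched below. Both function names are hypothetical, and the namespace argument defaulting to `default` is an assumption.

```bash
# Sketch: narrow the output to likely problems.
# Both function names are hypothetical; pass a namespace or omit it for "default".

# Pods whose phase is not Running (Pending, Failed, Succeeded, Unknown).
# A pod in CrashLoopBackOff usually still reports phase Running,
# so it is not listed here.
list_non_running_pods() {
  kubectl get pods -n "${1:-default}" -o wide \
    --field-selector 'status.phase!=Running'
}

# Warning events only, sorted so the most recent appear last.
list_warning_events() {
  kubectl get events -n "${1:-default}" \
    --field-selector type=Warning --sort-by=.lastTimestamp
}
```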
Common issues
| Status | Cause | Solution |
|---|---|---|
| Pending | No resources | Check node capacity, resource requests |
| Pending | No matching node | Check nodeSelector, taints/tolerations |
| ImagePullBackOff | Bad image/auth | Verify image name, imagePullSecrets |
| CrashLoopBackOff | App crashing | Check logs, entrypoint, health probes |
| CreateContainerConfigError | Bad configmap/secret | Verify referenced configs exist |
| Evicted | Node pressure | Check node conditions, resource limits |
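The table can also be read as a lookup from status to first check. A minimal sketch of that lookup follows; `next_check` is a hypothetical helper, and its hints are copied from the table rather than derived from the cluster.

```bash
# Sketch: map a pod status string to the first check suggested in the table.
# next_check is a hypothetical helper; it does not contact the cluster.
next_check() {
  case "$1" in
    Pending)
      echo "check node capacity, resource requests, nodeSelector, taints/tolerations" ;;
    ImagePullBackOff)
      echo "verify image name and imagePullSecrets" ;;
    CrashLoopBackOff)
      echo "check logs, entrypoint, health probes" ;;
    CreateContainerConfigError)
      echo "verify referenced ConfigMaps and Secrets exist" ;;
    Evicted)
      echo "check node conditions and resource limits" ;;
    *)
      echo "no table entry; start with kubectl describe" ;;
  esac
}
```

The status string is the `STATUS` column of `kubectl get pods`, so `next_check CrashLoopBackOff` prints the log, entrypoint, and probe hint.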
Service not reachable checklist
- Pod running? `kubectl get pods -l app=<app>`
- Pod ready? Check readiness probe
- Endpoints exist? `kubectl get endpoints <svc>`
- Service selector matches pod labels?
- Port/targetPort correct?
- NetworkPolicy blocking traffic?
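The selector and endpoint checks in this list can be sketched as one helper. This is an illustration under assumptions: `check_service` is a hypothetical name, and it covers only the selector and endpoint items, not ports or NetworkPolicy.

```bash
# Sketch: print a Service's selector and report whether it has ready endpoints.
# check_service is a hypothetical helper; svc and ns are its arguments.
check_service() {
  local svc="$1"
  local ns="${2:-default}"
  local ips

  echo "selector:"
  kubectl get svc "$svc" -n "$ns" -o jsonpath='{.spec.selector}'
  echo

  # IPs of ready pods behind the Service; empty output means no ready endpoints.
  ips="$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"

  if [ -z "$ips" ]; then
    echo "no ready endpoints: compare the selector with pod labels and check readiness"
    return 1
  fi
  echo "ready endpoints: $ips"
}
```

An empty endpoint list usually points back to the first items on the checklist: either no pod carries the labels in the selector, or the matching pods are failing their readiness probe.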
Rules
- MUST check events with `kubectl describe` before diagnosing
- MUST check logs for CrashLoopBackOff
- Never delete pods/resources without user approval
- Never apply changes without showing the diff first
- Always specify namespace if not default: `-n <namespace>`
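The diff-first and approval rules could be enforced with a small wrapper like the one below. It is a sketch, not a required tool: `safe_apply` is a hypothetical helper, and the `y` confirmation prompt is an assumed convention.

```bash
# Sketch: show the diff, then apply only after explicit approval.
# safe_apply is a hypothetical helper; manifest and ns are its arguments.
safe_apply() {
  local manifest="$1"
  local ns="${2:-default}"
  local rc=0
  local answer=""

  # kubectl diff exits 0 when nothing would change, 1 when there are
  # differences, and greater than 1 on error.
  kubectl diff -n "$ns" -f "$manifest" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "no changes to apply"
    return 0
  elif [ "$rc" -gt 1 ]; then
    echo "kubectl diff failed (exit $rc)" >&2
    return "$rc"
  fi

  printf 'Apply these changes to namespace %s? [y/N] ' "$ns"
  read -r answer
  if [ "$answer" = "y" ]; then
    kubectl apply -n "$ns" -f "$manifest"
  else
    echo "aborted; nothing applied"
  fi
}
```

Anything other than a literal `y` leaves the cluster untouched, which keeps the default on the safe side.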