Kubernetes Troubleshooting

Name: k8s-troubleshoot
Rating: 92
Author: mhalder

Debug pods, services, deployments, and networking issues in Kubernetes.

Instructions

•Identify the affected resource (pod, service, deployment) and its namespace
•Get the current state with kubectl get and kubectl describe
•Check logs with kubectl logs once a container has started or restarted
•Diagnose from the resource status and events
•Provide specific remediation steps
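The first three steps above can be sketched as one shell function. This is a minimal sketch, not part of kubectl: the name triage_pod and its argument order are hypothetical, and it assumes kubectl is on PATH and already pointed at the right cluster.

```bash
# Sketch only: triage_pod is a hypothetical helper name.
# Usage: triage_pod <pod> [namespace]
triage_pod() {
  local pod="$1" ns="${2:-default}"

  echo "== state =="
  kubectl get pod "$pod" -n "$ns" -o wide

  echo "== describe (status and events) =="
  kubectl describe pod "$pod" -n "$ns"

  echo "== logs (current container instance) =="
  kubectl logs "$pod" -n "$ns" --tail=50

  echo "== logs (previous instance, if the container restarted) =="
  kubectl logs "$pod" -n "$ns" --previous --tail=50 2>/dev/null \
    || echo "no previous container instance"
}
```

For example, triage_pod web-0 prod would print the state, events, and recent logs of a pod named web-0 in the prod namespace. The function only reads state; it never modifies the cluster.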

Diagnostic commands

bash

# Pod debugging
kubectl get pods -o wide
kubectl describe pod <pod>
kubectl logs <pod> [--previous] [-c container]
kubectl get events --sort-by=.lastTimestamp

# Service/networking
kubectl get svc,endpoints
kubectl describe svc <service>
kubectl get ingress

# Resource issues
kubectl top pods
kubectl describe node <node> | grep -A8 "Allocated resources"

# Debug pod (ephemeral container)
kubectl debug -it <pod> --image=busybox --target=<container>

Common issues

Status	Cause	Solution
Pending	Insufficient node resources	Check node capacity, resource requests
Pending	No node matches scheduling constraints	Check nodeSelector, taints/tolerations
ImagePullBackOff	Wrong image name or registry auth failure	Verify image name, imagePullSecrets
CrashLoopBackOff	Application crashes on start	Check logs, entrypoint, health probes
CreateContainerConfigError	Missing or invalid ConfigMap/Secret	Verify referenced configs exist
Evicted	Node under resource pressure	Check node conditions, resource limits
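As a rough illustration, the table can be applied to every unhealthy pod in a namespace. The helper names next_step and triage_namespace are hypothetical, and the sketch assumes the default kubectl get pods column order (NAME, READY, STATUS, ...), which custom output formats would change.

```bash
# Sketch only: next_step restates the Solution column of the table above.
next_step() {
  case "$1" in
    Pending)                    echo "Check node capacity, resource requests, nodeSelector, taints/tolerations" ;;
    ImagePullBackOff)           echo "Verify image name, imagePullSecrets" ;;
    CrashLoopBackOff)           echo "Check logs, entrypoint, health probes" ;;
    CreateContainerConfigError) echo "Verify referenced configs exist" ;;
    Evicted)                    echo "Check node conditions, resource limits" ;;
    *)                          echo "No table entry; inspect kubectl describe output" ;;
  esac
}

# Print name, status, and suggested next step for each pod that is not healthy.
# Assumes STATUS is the third column of the default output.
triage_namespace() {
  local ns="${1:-default}"
  kubectl get pods -n "$ns" --no-headers |
    while read -r name _ready status _rest; do
      case "$status" in
        Running|Completed) continue ;;
      esac
      printf '%s\t%s\t%s\n' "$name" "$status" "$(next_step "$status")"
    done
}
```

Calling triage_namespace prod lists only the pods that need attention. The suggestion is a starting point; the events shown by kubectl describe remain the authoritative source for the diagnosis.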

Service not reachable checklist

•Pod running? kubectl get pods -l app=<app>
•Pod ready? Check readiness probe
•Endpoints exist? kubectl get endpoints <svc>
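•Service selector matches pod labels? Compare kubectl describe svc <svc> with kubectl get pods --show-labels
•Port/targetPort correct? targetPort must match the port the container listens on
•NetworkPolicy blocking traffic? kubectl get networkpolicy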
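The checklist above could be run in one pass with a helper like the following. The name check_service and its arguments are hypothetical, and the sketch assumes the pods carry an app=<app> label as in the first checklist item. It only gathers the evidence; comparing the selector against the pod labels is left to the reader.

```bash
# Sketch only: check_service is a hypothetical helper name.
# Usage: check_service <service> <app-label-value> [namespace]
check_service() {
  local svc="$1" app="$2" ns="${3:-default}"

  echo "== pods (READY column reflects the readiness probe) =="
  kubectl get pods -n "$ns" -l "app=$app" -o wide --show-labels

  echo "== endpoints (empty means no ready pod matches the selector) =="
  kubectl get endpoints "$svc" -n "$ns"

  echo "== service selector =="
  kubectl get svc "$svc" -n "$ns" -o jsonpath='{.spec.selector}{"\n"}'

  echo "== port -> targetPort =="
  kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'

  echo "== network policies in the namespace =="
  kubectl get networkpolicy -n "$ns"
}
```

For example, check_service web-svc web prod prints each check in the checklist order, so the first section with unexpected output points at the likely cause.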

Rules

•MUST check events with kubectl describe before diagnosing
•MUST check logs for CrashLoopBackOff
•Never delete pods/resources without user approval
•Never apply changes without showing the diff first (kubectl diff -f <file>)
•Always specify the namespace when it is not default: -n <namespace>
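One way to follow the diff and approval rules is a wrapper that shows kubectl diff and asks before applying. This is a sketch under assumptions: safe_apply is a hypothetical name, and it relies on the documented kubectl diff exit codes (0 for no differences, 1 for differences found, greater than 1 for an error).

```bash
# Sketch only: safe_apply is a hypothetical helper name.
# Usage: safe_apply <manifest-file> [namespace]
safe_apply() {
  local manifest="$1" ns="${2:-default}" rc answer

  # Show what would change before touching the cluster.
  kubectl diff -n "$ns" -f "$manifest"
  rc=$?

  if [ "$rc" -eq 0 ]; then
    echo "no changes to apply"
    return 0
  fi
  if [ "$rc" -gt 1 ]; then
    echo "kubectl diff failed (exit $rc)" >&2
    return "$rc"
  fi

  # Exit code 1: differences exist, so ask for explicit approval.
  printf 'Apply these changes to namespace %s? [y/N] ' "$ns" >&2
  read -r answer
  if [ "$answer" = "y" ]; then
    kubectl apply -n "$ns" -f "$manifest"
  else
    echo "aborted; nothing applied"
  fi
}
```

Anything other than a literal y aborts, so the default is to leave the cluster unchanged.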