Kubernetes Deployment
Deploy containerized applications from development to production on Kubernetes with AI-assisted generation and validation.
Quick Start Decision Tree
What do you need?
│
├─ Understand Pods first? → See "Pod Fundamentals"
│
├─ First deployment? → See "Hello World Deployment"
│
├─ Natural language to YAML? → See "AI-Assisted Generation"
│
├─ Expose application?
│ ├─ Internal only → ClusterIP Service
│ ├─ External (dev) → NodePort Service
│ └─ External (prod) → LoadBalancer + Ingress
│
├─ Handle traffic spikes? → HPA Autoscaling
│
├─ Run batch workloads?
│ ├─ One-time task → Job
│ └─ Scheduled task → CronJob
│
├─ Store configuration?
│ ├─ Non-sensitive → ConfigMap
│ └─ Sensitive → Secret
│
├─ Persist data? → PersistentVolumeClaim
│
├─ Production ready? → See "Production Checklist"
│
├─ Validate manifests? → See "Production Readiness Validation"
│
├─ Refine iteratively? → See "Iterative Refinement"
│
└─ Pod not working? → See "Troubleshooting Reference"
   ├─ CrashLoopBackOff → Check logs --previous
   ├─ Pending → Check resources/node capacity
   ├─ OOMKilled → Increase memory limit
   └─ ImagePullBackOff → Check image name/auth
Pod Fundamentals
Pods are the smallest deployable units in Kubernetes. Understanding Pod structure is essential before working with Deployments.
Standalone Pod (Basic)
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
    version: v1
spec:
  containers:
  - name: app
    image: nginx:1.25
    ports:
    - containerPort: 80
Pod with Resource Requests and Limits
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: app
    image: python:3.11-alpine
    command: ["python", "-m", "http.server", "8000"]
    ports:
    - containerPort: 8000
    resources:
      requests:
        cpu: 100m      # Minimum guaranteed (scheduling)
        memory: 64Mi   # Minimum guaranteed
      limits:
        cpu: 500m      # Maximum allowed (throttled if exceeded)
        memory: 256Mi  # Maximum allowed (OOMKilled if exceeded)
Resource Units Quick Reference:
| Type | Unit | Examples |
|---|---|---|
| CPU | millicores | 100m = 0.1 CPU, 1000m = 1 CPU |
| Memory | binary | 64Mi, 256Mi, 1Gi |
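To make the unit table concrete, here is a minimal Python sketch that converts quantity strings into plain numbers. The helper names `parse_cpu` and `parse_memory` are illustrative assumptions, not part of any Kubernetes client library, and only the common suffixes are handled.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes commonly used for memory.
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}

def parse_cpu(quantity: str) -> Decimal:
    """Return CPU cores, e.g. "100m" is 0.1 and "2" is 2."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)

def parse_memory(quantity: str) -> int:
    """Return bytes, e.g. "64Mi" is 64 * 1024 * 1024."""
    # Try two-letter suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_MEMORY_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            number = Decimal(quantity[:-len(suffix)])
            return int(number * _MEMORY_SUFFIXES[suffix])
    return int(Decimal(quantity))

print(parse_cpu("100m"))     # 0.1
print(parse_memory("64Mi"))  # 67108864
```

Note that `128M` (decimal, 128,000,000 bytes) and `128Mi` (binary, 134,217,728 bytes) are different amounts, which is why manifests in this guide use the binary suffixes consistently.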
Multi-Container Pod (Sidecar Pattern)
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logging
  labels:
    app: web
spec:
  containers:
  # Main application container
  - name: web
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log/nginx
  # Sidecar: log collector
  - name: log-collector
    image: busybox:1.36
    command: ["sh", "-c", "tail -F /logs/access.log"]
    resources:
      requests:
        cpu: 50m
        memory: 32Mi
      limits:
        cpu: 100m
        memory: 64Mi
    volumeMounts:
    - name: shared-logs
      mountPath: /logs
  volumes:
  - name: shared-logs
    emptyDir: {}
Init Container Pattern
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  - name: init-db-check
    image: busybox:1.36
    command: ['sh', '-c', 'until nc -z db-service 5432; do sleep 2; done']
    resources:
      requests:
        cpu: 50m
        memory: 32Mi
      limits:
        cpu: 100m
        memory: 64Mi
  containers:
  - name: app
    image: myapp:1.0
    ports:
    - containerPort: 8080
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi
Pod Networking Basics
┌─────────────────────────────────────────────────────┐
│ Pod (shared network namespace)                      │
│ IP: 10.1.0.15                                       │
│                                                     │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐     │
│ │ Container A │ │ Container B │ │ Container C │     │
│ │ :8080       │ │ :9090       │ │ :3000       │     │
│ └─────────────┘ └─────────────┘ └─────────────┘     │
│                                                     │
│ Containers communicate via localhost:PORT           │
└─────────────────────────────────────────────────────┘
Key networking facts:
- All containers in a Pod share the same IP address
- Containers communicate via localhost:<port>
- Each container must use a unique port
- Pod IP is accessible from other Pods in the cluster
Pod Commands
# Create pod from manifest
kubectl apply -f pod.yaml

# List pods with node and IP details
kubectl get pods -o wide

# View pod details and events
kubectl describe pod myapp-pod

# View pod logs
kubectl logs myapp-pod
kubectl logs web-with-logging -c log-collector  # specific container

# Execute command in pod
kubectl exec -it myapp-pod -- /bin/sh
kubectl exec -it web-with-logging -c web -- /bin/sh  # specific container

# Check resource usage (requires metrics-server)
kubectl top pod myapp-pod

# Delete pod
kubectl delete pod myapp-pod
Note: For production workloads, use Deployments instead of standalone Pods. Deployments provide self-healing, scaling, and rolling updates.
Hello World Deployment
Minimal Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: app
        image: nginx:1.25  # pinned tag; avoid "latest" so rollouts are reproducible
        ports:
        - containerPort: 80
Expose with Service
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: LoadBalancer
  selector:
    app: hello-world
  ports:
  - port: 80
    targetPort: 80
Deploy Commands
# Apply manifests
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Verify
kubectl get deployments
kubectl get pods
kubectl get services

# Access logs
kubectl logs -l app=hello-world

# Delete
kubectl delete -f deployment.yaml -f service.yaml
Standard Deployment Workflow
1. Create Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
2. Create ConfigMap (if needed)
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: myapp
data:
  APP_ENV: production
  LOG_LEVEL: info
3. Create Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
        envFrom:
        - configMapRef:
            name: myapp-config
4. Create Service
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
5. Create Ingress (for external access)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: myapp
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 80
Scaling
Manual Scaling
kubectl scale deployment myapp --replicas=5 -n myapp
Auto Scaling with HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
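Prerequisite: CPU utilization is measured as a percentage of each container's CPU request, so resource requests must be defined on the target Pods. The HPA also needs Metrics Server to supply those metrics. Install it with: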
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
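How the HPA arrives at a replica count can be sketched in a few lines of Python. This follows the documented formula, desired = ceil(current replicas × current metric / target metric), with the default 10% tolerance around the target. The function name and defaults below are assumptions chosen to mirror the manifest above; the real controller also accounts for unready Pods, missing metrics, and stabilization windows, which this sketch ignores.

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float = 70,
                     min_replicas: int = 2, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for one resource metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds declared in the HPA spec.
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(2, 140))  # 4  (utilization is double the target)
print(desired_replicas(4, 35))   # 2  (utilization is half the target)
print(desired_replicas(8, 200))  # 10 (capped at maxReplicas)
```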
Deployment Strategies
Rolling Update (Default)
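Gradually replaces old Pods with new ones. The application stays available during the update provided a readiness probe is configured and maxUnavailable is lower than the replica count.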
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max pods over desired count during update
      maxUnavailable: 1  # Max pods that can be unavailable during update
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:2.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
Rolling Update Behavior:
replicas: 3, maxSurge: 1, maxUnavailable: 1

Step 1: [v1] [v1] [v1]        ← Initial state (3 pods)
Step 2: [v1] [v1] [v2] [v2]   ← 1 old terminated, 2 new started (at most 4 pods, at least 2 available)
Step 3: [v1] [v2] [v2] [v2]   ← Continue rolling
Step 4: [v2] [v2] [v2]        ← Complete (3 pods running v2)
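The limits that maxSurge and maxUnavailable place on a rollout can be worked out with a small Python sketch. The function names are illustrative assumptions. The rounding follows the documented behavior: a percentage maxSurge rounds up, while a percentage maxUnavailable rounds down.

```python
import math

def resolve(value, replicas: int, round_up: bool) -> int:
    """Resolve an absolute count or a percentage string such as "25%"."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1]) / 100
        return math.ceil(scaled) if round_up else math.floor(scaled)
    return int(value)

def rollout_bounds(replicas: int, max_surge=1, max_unavailable=1) -> dict:
    """Return the Pod-count limits the rollout must stay within."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }

print(rollout_bounds(3, 1, 1))
# {'max_total_pods': 4, 'min_available_pods': 2}
print(rollout_bounds(10, "25%", "25%"))
# {'max_total_pods': 13, 'min_available_pods': 8}
```

For the example above (3 replicas, both settings at 1), the rollout never runs more than 4 Pods and never drops below 2 available Pods.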
Recreate Strategy
Terminates all existing Pods before creating new ones. Causes downtime but ensures no version mixing.
spec:
  strategy:
    type: Recreate
When to use Recreate:
- Application cannot run multiple versions simultaneously
- Database migrations requiring exclusive access
- Stateful applications with strict consistency requirements
Strategy Comparison
| Aspect | RollingUpdate | Recreate |
|---|---|---|
| Downtime | None | Yes |
| Resource usage | Higher (runs both versions) | Lower |
| Rollback speed | Fast | Slow |
| Version mixing | Yes (briefly) | No |
ReplicaSet Management
Deployments create and manage ReplicaSets automatically.
Deployment (myapp)
│
├── ReplicaSet (myapp-7d9f8b6c5) ← Current (3 replicas)
│ ├── Pod (myapp-7d9f8b6c5-abc12)
│ ├── Pod (myapp-7d9f8b6c5-def34)
│ └── Pod (myapp-7d9f8b6c5-ghi56)
│
└── ReplicaSet (myapp-5c4d3b2a1) ← Previous (0 replicas, kept for rollback)
# View ReplicaSets
kubectl get rs -n myapp

# View ReplicaSet details
kubectl describe rs myapp-7d9f8b6c5 -n myapp

# Control revision history (in Deployment spec)
spec:
  revisionHistoryLimit: 10  # Number of old ReplicaSets to retain
Self-Healing Mechanisms
Kubernetes automatically maintains the desired state:
| Failure | Response |
|---|---|
| Pod crashes | ReplicaSet creates replacement Pod |
| Pod deleted | ReplicaSet creates replacement Pod |
| Node fails | Pods rescheduled to healthy nodes |
| Container OOMKilled | Container restarted according to restartPolicy |
| Health check fails | Liveness failure: container restarted. Readiness failure: Pod removed from Service endpoints |
# Watch self-healing in action
kubectl get pods -w -n myapp

# Delete a pod (watch it recreate)
kubectl delete pod myapp-7d9f8b6c5-abc12 -n myapp
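The idea behind self-healing is a control loop that compares desired state with observed state and acts on the difference. The toy Python sketch below illustrates that comparison only. The function `reconcile` and its action strings are hypothetical and far simpler than the real ReplicaSet controller, which also weighs Pod readiness, age, and node placement when choosing what to delete.

```python
def reconcile(desired_replicas: int, running_pods: list) -> list:
    """Return the actions that move the observed state to the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few Pods (one crashed or was deleted): create replacements.
        return ["create pod"] * diff
    if diff < 0:
        # Too many Pods (after a scale-down): remove the surplus.
        return [f"delete {name}" for name in running_pods[:-diff]]
    # Observed state already matches desired state.
    return []

# One of three Pods was deleted, so one replacement is requested.
print(reconcile(3, ["myapp-abc12", "myapp-def34"]))  # ['create pod']
```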
Updates and Rollbacks
Update Methods
# Method 1: Update image directly
kubectl set image deployment/myapp app=myapp:2.0 -n myapp
# Method 2: Edit deployment
kubectl edit deployment myapp -n myapp
# Method 3: Apply updated manifest (recommended)
kubectl apply -f deployment.yaml
# Method 4: Patch specific fields
kubectl patch deployment myapp -n myapp -p '{"spec":{"template":{"spec":{"containers":[{"name":"app","image":"myapp:2.0"}]}}}}'
Monitor Rollout
# Watch rollout progress
kubectl rollout status deployment/myapp -n myapp

# View rollout history
kubectl rollout history deployment/myapp -n myapp

# View specific revision details
kubectl rollout history deployment/myapp -n myapp --revision=2
Rollback
# Rollback to previous revision
kubectl rollout undo deployment/myapp -n myapp

# Rollback to specific revision
kubectl rollout undo deployment/myapp -n myapp --to-revision=2

# Verify rollback
kubectl rollout status deployment/myapp -n myapp
Pause and Resume (for batched changes)
# Pause rollout
kubectl rollout pause deployment/myapp -n myapp

# Apply multiple changes without triggering rollout
kubectl set image deployment/myapp app=myapp:2.0 -n myapp
kubectl set resources deployment/myapp -c app --limits=cpu=500m,memory=512Mi -n myapp
kubectl set env deployment/myapp -c app APP_ENV=production -n myapp

# Resume to trigger single rollout with all changes
kubectl rollout resume deployment/myapp -n myapp
Jobs and CronJobs
For batch processing and scheduled tasks, use Jobs and CronJobs instead of Deployments.
Job (One-time Task)
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  completions: 5                # Total successful completions needed
  parallelism: 2                # Run 2 pods concurrently
  backoffLimit: 3               # Max retries before failure
  activeDeadlineSeconds: 600    # Timeout after 10 minutes
  ttlSecondsAfterFinished: 300  # Auto-cleanup after 5 min
  template:
    spec:
      restartPolicy: OnFailure  # Required: Never or OnFailure
      containers:
      - name: processor
        image: myapp:1.0
        command: ["python", "process.py"]
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
CronJob (Scheduled Task)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-cleanup
spec:
  schedule: "0 2 * * *"         # Daily at 2 AM
  timeZone: "America/New_York"  # K8s 1.27+
  concurrencyPolicy: Forbid     # Skip if previous still running
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 3600
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: cleanup
            image: cleanup:1.0
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"
              limits:
                cpu: "500m"
                memory: "512Mi"
Cron Schedule Quick Reference:
| Schedule | Meaning |
|---|---|
| */15 * * * * | Every 15 minutes |
| 0 * * * * | Every hour |
| 0 2 * * * | Daily at 2 AM |
| 0 0 * * 0 | Weekly on Sunday |
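To check how a schedule string maps to concrete times, the following Python sketch tests whether a given moment matches a five-field cron expression. It is a simplified illustration, not the parser Kubernetes uses: it supports `*`, `*/n`, ranges, and comma lists, and it requires all five fields to match, which differs from standard cron when both day-of-month and day-of-week are restricted. The function names are assumptions.

```python
from datetime import datetime

def field_matches(field: str, value: int, minimum: int = 0) -> bool:
    """Match one cron field supporting *, */n, a-b, and comma lists."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Steps count from the field's lowest legal value.
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(schedule: str, when: datetime) -> bool:
    """Return True if `when` falls on the schedule (minute resolution)."""
    minute, hour, day, month, weekday = schedule.split()
    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(day, when.day, minimum=1)
            and field_matches(month, when.month, minimum=1)
            # cron counts Sunday as 0; isoweekday() counts it as 7.
            and field_matches(weekday, when.isoweekday() % 7))

# 7 January 2024 was a Sunday.
print(cron_matches("0 2 * * *", datetime(2024, 1, 7, 2, 0)))  # True
print(cron_matches("0 0 * * 0", datetime(2024, 1, 8, 0, 0)))  # False (Monday)
```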
Job Commands
# Create and monitor Job
kubectl apply -f job.yaml
kubectl get jobs -w
kubectl logs job/data-processor
# Manually trigger CronJob
kubectl create job manual-run --from=cronjob/daily-cleanup
# Suspend/resume CronJob
kubectl patch cronjob daily-cleanup -p '{"spec":{"suspend":true}}'   # suspend
kubectl patch cronjob daily-cleanup -p '{"spec":{"suspend":false}}'  # resume
Essential Commands
# Apply/Update resources
kubectl apply -f manifest.yaml

# Get resources
kubectl get pods,deployments,services -n myapp

# Describe (detailed info)
kubectl describe deployment myapp -n myapp

# Logs
kubectl logs -l app=myapp -n myapp --tail=100

# Exec into pod
kubectl exec -it <pod-name> -n myapp -- /bin/sh

# Port forward for local testing
kubectl port-forward svc/myapp 8080:80 -n myapp

# Delete resources
kubectl delete -f manifest.yaml
Reference Documentation
Detailed reference for each topic:
| Topic | Reference File |
|---|---|
| Cluster & Context Management | references/cluster-management.md |
| Architecture & Reconciliation | references/architecture.md |
| Pods | references/pods.md |
| Deployments | references/deployments.md |
| Services | references/services.md |
| ConfigMaps & Secrets | references/configuration.md |
| Scaling & HPA | references/scaling.md |
| Jobs & CronJobs | references/jobs.md |
| Ingress | references/ingress.md |
| Health Probes | references/health-probes.md |
| Resource Limits | references/resources.md |
| Storage (PV/PVC) | references/storage.md |
| Namespaces | references/namespaces.md |
| Security | references/security.md |
| Production Checklist | references/production-checklist.md |
| Troubleshooting | references/troubleshooting.md |
AI-Assisted Manifest Generation
Generate Kubernetes manifests from natural language descriptions using AI-powered parsing and generation.
Natural Language to YAML Translation
Convert plain English descriptions directly to valid Kubernetes YAML:
# Generate deployment from natural language
python scripts/natural-language-generator.py --describe "Deploy nginx web server with 3 replicas, expose on port 80"

# Generate with additional features
python scripts/natural-language-generator.py --describe "Create production API with 5 replicas, enable HPA and ingress" --validate

# Refine existing manifest with feedback
python scripts/natural-language-generator.py --refine existing.yaml --describe "Add security context and resource limits"
Supported Natural Language Patterns
| Pattern | Example |
|---|---|
| Replicas | "with 3 replicas", "3 instances", "scale to 5 pods" |
| Ports | "expose on port 8080", "listen on 3000" |
| Images | "use nginx:1.25", "container image redis:7" |
| Services | "with LoadBalancer", "expose externally" |
| HPA | "enable auto-scaling", "with HPA" |
| Production | "production hardened", "secure deployment" |
| Resources | "100m CPU, 256Mi memory" |
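As a rough illustration of how phrases like those in the table could be recognized, here is a regex-based Python sketch. It is not the implementation of scripts/natural-language-generator.py, whose internals are not shown here. The function `parse_description` and its output keys are hypothetical, and only a few of the listed patterns are covered.

```python
import re

def parse_description(text: str) -> dict:
    """Extract a few deployment settings from a plain-English description."""
    spec = {}
    lowered = text.lower()

    # Replicas: "3 replicas", "3 instances", "scale to 5 pods"
    replicas = re.search(r"(\d+)\s+(?:replicas|instances|pods)", lowered)
    if replicas:
        spec["replicas"] = int(replicas.group(1))

    # Ports: "expose on port 8080", "listen on 3000"
    port = re.search(r"(?:on port|listen on)\s+(\d+)", lowered)
    if port:
        spec["port"] = int(port.group(1))

    # Images: "use nginx:1.25", "container image redis:7"
    image = re.search(r"(?:use|image)\s+([\w./-]+:[\w.-]+)", text)
    if image:
        spec["image"] = image.group(1)

    # Services: "with LoadBalancer", "expose externally"
    if "loadbalancer" in lowered or "expose externally" in lowered:
        spec["service_type"] = "LoadBalancer"

    # HPA: "enable auto-scaling", "with HPA"
    spec["hpa"] = "hpa" in lowered or "auto-scaling" in lowered
    return spec

print(parse_description("Deploy use nginx:1.25 with 3 replicas, listen on 3000"))
# {'replicas': 3, 'port': 3000, 'image': 'nginx:1.25', 'hpa': False}
```

Keyword matching like this is brittle, which is one reason generated manifests should still be validated before they are applied.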
Example Conversions
Input: "Deploy a web app called myapi using image myapp:1.0 with 3 replicas, expose via LoadBalancer on port 8080, enable HPA"
Output:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapi
  namespace: default
  labels:
    app: myapi
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapi
  template:
    metadata:
      labels:
        app: myapi
    spec:
      containers:
      - name: myapi
        image: myapp:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 15
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myapi
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: myapi
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapi-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapi
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
Iterative Refinement Workflows
Improve Kubernetes manifests through feedback-driven iteration cycles.
Refinement Process
- Generate Initial Manifest - Create base configuration from requirements
- Validate & Evaluate - Run automated checks and scoring
- Gather Feedback - Collect input from validation tools and humans
- Apply Improvements - Integrate feedback into manifest
- Repeat - Continue until quality targets are met
Common Refinement Patterns
| Feedback Category | Example Feedback | Applied Change |
|---|---|---|
| Security | "Add security context" | Add runAsNonRoot, readOnlyRootFilesystem |
| Resources | "Add resource limits" | Add CPU/memory requests and limits |
| Availability | "Increase replicas" | Change replica count from 1 to 3 |
| Scaling | "Enable HPA" | Add HorizontalPodAutoscaler resource |
| Networking | "Change service type" | Switch from ClusterIP to LoadBalancer |
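A few rows of the table can be expressed as a small Python sketch that applies feedback to a Deployment held as a dictionary. This is an illustration under stated assumptions, not the refinement logic of the repository's scripts. The function `apply_feedback`, the phrases it recognizes, and the default resource values (copied from the examples in this guide) are all hypothetical.

```python
import copy

def apply_feedback(deployment: dict, feedback: str) -> dict:
    """Return a refined copy of a Deployment for a few known feedback phrases."""
    refined = copy.deepcopy(deployment)  # leave the input manifest untouched
    container = refined["spec"]["template"]["spec"]["containers"][0]
    text = feedback.lower()

    if "security context" in text:
        container.setdefault("securityContext", {}).update(
            {"runAsNonRoot": True, "readOnlyRootFilesystem": True})
    if "resource limits" in text:
        # Only fill in resources when none are defined yet.
        container.setdefault("resources", {
            "requests": {"cpu": "100m", "memory": "128Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        })
    if "increase replicas" in text:
        refined["spec"]["replicas"] = max(refined["spec"].get("replicas", 1), 3)
    return refined

base = {"spec": {"replicas": 1, "template": {"spec": {
    "containers": [{"name": "app", "image": "myapp:1.0"}]}}}}
refined = apply_feedback(base, "Add security context and resource limits")
print(refined["spec"]["template"]["spec"]["containers"][0]["securityContext"])
# {'runAsNonRoot': True, 'readOnlyRootFilesystem': True}
```

Returning a copy rather than mutating the input keeps each iteration's manifest available for comparison, which suits a review-and-repeat workflow.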
Refinement Commands
# Refine manifest with specific feedback
python scripts/natural-language-generator.py --refine deployment.yaml --describe "Add production security settings"

# Apply multiple refinements iteratively
python scripts/natural-language-generator.py --refine deployment.yaml --describe "Add resource limits and health checks" --output refined.yaml
Production Readiness Validation
Comprehensive validation to ensure manifests meet production standards.
Validation Categories
Security Validation
- Container runs as non-root
- Privilege escalation disabled
- Capabilities properly restricted
- Images from trusted sources
Resource Validation
- CPU and memory requests/limits defined
- Resource ratios appropriate
- No unlimited resources
Availability Validation
- Sufficient replica count
- Health checks configured
- Proper deployment strategy
Best Practices Validation
- Proper labeling
- Immutable tags used
- Appropriate service types
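Several of the checks listed above can be sketched as a Python function over a single container definition. This is a hedged illustration, not the logic of scripts/production-readiness-validator.sh. The function `check_container` and its finding messages are hypothetical, and it covers only a subset of the categories.

```python
def check_container(container: dict) -> list:
    """Return a list of findings for one container spec (empty means no issues found)."""
    findings = []

    # Security: non-root user and no privilege escalation.
    security = container.get("securityContext", {})
    if not security.get("runAsNonRoot"):
        findings.append("security: container may run as root")
    if security.get("allowPrivilegeEscalation", True):
        findings.append("security: privilege escalation not disabled")

    # Resources: CPU and memory requests and limits must all be present.
    resources = container.get("resources", {})
    for section in ("requests", "limits"):
        for name in ("cpu", "memory"):
            if name not in resources.get(section, {}):
                findings.append(f"resources: missing {section}.{name}")

    # Best practice: pinned image tag (ignore a ":" that belongs to a registry port).
    image = container.get("image", "")
    tag = image.rsplit(":", 1)[1] if ":" in image.rsplit("/", 1)[-1] else ""
    if tag in ("", "latest"):
        findings.append("best practice: image tag is missing or 'latest'")

    # Availability: both probes configured.
    for probe in ("livenessProbe", "readinessProbe"):
        if probe not in container:
            findings.append(f"availability: {probe} not configured")
    return findings

for finding in check_container({"name": "app", "image": "nginx:latest"}):
    print(finding)
```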
Validation Commands
# Validate single manifest file
./scripts/production-readiness-validator.sh deployment.yaml

# Validate and score manifest
./scripts/production-readiness-validator.sh deployment.yaml
# Output: Production Readiness Score: 92/100

# Traditional validation after deployment
./scripts/validate-deployment.sh myapp production
Critical Evaluation Checklist
Use the comprehensive checklist to evaluate production readiness:
- Security Assessment (container, network, RBAC)
- Resource Management (requests, limits, QoS)
- Availability and Reliability (probes, strategy, scaling)
- Configuration Management (secrets, naming, labels)
- Observability (logging, monitoring, tracing)
- Production Readiness (backup, lifecycle, testing)
- Performance Optimization (efficiency, network)
- Documentation and Maintenance
See references/critical-evaluation-checklist.md for the complete checklist.
Scripts
| Script | Purpose |
|---|---|
| scripts/generate-manifest.py | Generate Kubernetes manifests interactively |
| scripts/validate-deployment.sh | Validate deployment readiness |
| scripts/natural-language-generator.py | AI-assisted manifest generation from natural language |
| scripts/production-readiness-validator.sh | Production readiness validation and scoring |