K8s Python/FastAPI Optimization
This skill focuses on resolving common bottlenecks and build errors when deploying Python services to a cluster.
Dockerfile Best Practices
1. Multi-stage Builds & Editable Installs
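Never use pip install -e . in a multi-stage Dockerfile where the production stage copies code to a different directory. An editable install does not copy the package into site-packages; it records a reference to the source directory's path, so imports break once that path no longer exists in the final image.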
Correct Build Stage:
    # production install
    RUN pip install .
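As a guard against an editable install slipping into the final image, a check along these lines could run at build time or in CI. It is a minimal sketch that assumes pip recorded PEP 610 metadata (direct_url.json) for the install; the function names is_editable and editable_distributions are illustrative, not part of any library.

```python
import json
from importlib import metadata
from typing import Optional


def is_editable(direct_url_text: Optional[str]) -> bool:
    """Return True if a PEP 610 direct_url.json payload marks an editable install."""
    if not direct_url_text:
        # No direct_url.json: a regular install from an index or a wheel.
        return False
    info = json.loads(direct_url_text)
    return bool(info.get("dir_info", {}).get("editable", False))


def editable_distributions() -> list[str]:
    """Names of installed distributions that report an editable install."""
    names = []
    for dist in metadata.distributions():
        if is_editable(dist.read_text("direct_url.json")):
            names.append(dist.metadata.get("Name", "unknown"))
    return sorted(names)
```

Calling editable_distributions() inside the production image and failing the build when the list is non-empty would catch the problem before deployment.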
2. User Permissions
Always use a non-root user with a fixed UID (e.g., 1000) for security and volume permission consistency.
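A startup check can confirm that the container really runs as the intended user. The sketch below assumes the fixed UID is 1000, as in the example above; EXPECTED_UID and check_runtime_user are hypothetical names.

```python
import os

EXPECTED_UID = 1000  # assumption: the fixed UID chosen in the Dockerfile


def check_runtime_user(uid: int, expected_uid: int = EXPECTED_UID) -> None:
    """Raise RuntimeError if the process runs as root or as an unexpected UID."""
    if uid == 0:
        raise RuntimeError("refusing to run as root")
    if uid != expected_uid:
        raise RuntimeError(f"running as UID {uid}, expected {expected_uid}")
```

At application startup this would be called as check_runtime_user(os.getuid()), which is available on the Unix systems containers run on.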
Application Server Tuning (Gunicorn/Uvicorn)
1. Handling Slow Library Imports
Heavy AI libraries (like openai-agents) can significantly slow down startup. Gunicorn's default timeout is 30 seconds; a worker that is still importing when the timeout elapses is killed and restarted before it ever serves a request.
gunicorn.conf.py Optimization:
    workers = 1          # Reduce on resource-constrained nodes
    timeout = 120        # Significant increase for AI/heavy imports
    preload_app = False  # Set to False to debug worker startup errors individually
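Rather than guessing the timeout, the import cost can be measured on the target node. The following is a rough sketch: the helper names, the 3x margin, and the 30-second floor (Gunicorn's default timeout) are assumptions for illustration.

```python
import importlib
import math
import time


def measure_import_seconds(module_name: str) -> float:
    """Time a single import; only meaningful in a fresh interpreter."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


def suggest_timeout(import_seconds: float, margin: float = 3.0, floor: int = 30) -> int:
    """Scale the measured import time by a safety margin, never going below the floor."""
    return max(floor, math.ceil(import_seconds * margin))
```

A module that is already in sys.modules returns almost instantly, so the measurement should run in a new process, for example inside the container with the same CPU limits as the pod.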
Networking and Service Binding
1. Sidecar Communication
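For Dapr or other sidecar patterns to work, bind the application to 0.0.0.0 rather than 127.0.0.1 or a single interface. Containers in a pod share one network namespace, so the sidecar reaches the app over localhost, while kubelet probes and Service traffic arrive on the pod IP. A server bound to only one of those addresses is unreachable on the other, and some frameworks pick a bind address from the environment when none is specified.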
Next.js / Standalone Fix:
    # Set HOSTNAME env to bind correctly
    ENV HOSTNAME="0.0.0.0"
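For the FastAPI side, the same idea can be sketched as an explicit bind address with 0.0.0.0 as the default. The environment variable names APP_HOST and APP_PORT and the import path "main:app" are hypothetical; uvicorn is imported lazily so the address logic can be used without it.

```python
import os
from typing import Mapping, Optional


def resolve_bind(env: Optional[Mapping[str, str]] = None) -> tuple[str, int]:
    """Return (host, port), defaulting to all interfaces on port 8000."""
    env = os.environ if env is None else env
    host = env.get("APP_HOST", "0.0.0.0")  # hypothetical variable name
    port = int(env.get("APP_PORT", "8000"))  # hypothetical variable name
    return host, port


def serve(app_path: str = "main:app") -> None:
    """Start uvicorn on the resolved address; app_path is an assumed import string."""
    import uvicorn  # lazy import: only needed when actually serving

    host, port = resolve_bind()
    uvicorn.run(app_path, host=host, port=port)
```

The sketch deliberately avoids reading HOSTNAME, since container runtimes typically set that variable to the container's hostname, which would then be taken as the bind address.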
2. Probe Consistency
Match probe delays to the application's actual startup time. If your app takes 40s to start, a liveness probe with initialDelaySeconds: 10 can exhaust its failure threshold before the app is up, leading to unnecessary restarts.
Probe Settings:
- Liveness: initialDelaySeconds: 120 (safe for slow nodes)
- Readiness: initialDelaySeconds: 60
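The relationship between startup time and the delays above can be written down as a small calculation. This is a sketch under stated assumptions: the factors 1.5 and 3.0 are not from any Kubernetes guideline, they are simply chosen so that a 40s startup reproduces the values listed here, and probe_delays is a hypothetical helper.

```python
import math


def probe_delays(
    startup_seconds: float,
    readiness_factor: float = 1.5,  # assumption: modest headroom before routing traffic
    liveness_factor: float = 3.0,   # assumption: generous headroom before restarts
) -> dict[str, int]:
    """Derive initialDelaySeconds values from a measured startup time."""
    return {
        "readiness": math.ceil(startup_seconds * readiness_factor),
        "liveness": math.ceil(startup_seconds * liveness_factor),
    }
```

For a measured startup of 40 seconds, probe_delays(40) yields a readiness delay of 60 and a liveness delay of 120; slower nodes would call for re-measuring rather than reusing those numbers.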