K8s Python/FastAPI Optimization

This skill covers common performance bottlenecks and build errors that appear when deploying Python services to a Kubernetes cluster.

Dockerfile Best Practices

1. Multi-stage Builds & Editable Installs

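Do not use pip install -e . in a multi-stage Dockerfile whose production stage copies the code to a different directory. An editable install does not copy the package into site-packages; it records a pointer (a .pth entry or import hook) to the source directory used at build time, so imports fail once that path no longer exists in the final image.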

Correct Build Stage:

dockerfile

# Production install: copies the package into site-packages (non-editable)
RUN pip install --no-cache-dir .
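
One way to catch an editable install before it ships is to inspect the metadata pip records for the package. The sketch below assumes a pip version that writes PEP 610 direct_url.json metadata; legacy setup.py develop installs may not be detected, and the helper names are illustrative.

```python
import json
from importlib import metadata
from typing import Optional


def direct_url_is_editable(direct_url_text: Optional[str]) -> bool:
    """Return True if PEP 610 direct_url.json content marks an editable install."""
    if not direct_url_text:
        # No direct_url.json recorded (e.g. installed from a package index).
        return False
    info = json.loads(direct_url_text)
    return bool(info.get("dir_info", {}).get("editable", False))


def is_editable_install(dist_name: str) -> bool:
    """Look up an installed distribution by name and check its install mode."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return False
    # read_text returns None when the file is absent from the metadata directory.
    return direct_url_is_editable(dist.read_text("direct_url.json"))
```

Calling is_editable_install with the service's distribution name inside the production image (for example in a build-time smoke test) should return False if the build stage used a regular install.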

2. User Permissions

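Run the container as a non-root user with a fixed numeric UID (e.g., 1000). This limits the damage a compromised process can do and keeps file ownership on mounted volumes consistent across image rebuilds.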
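
A startup guard can confirm that the container actually runs as the intended user. This is a minimal sketch: the function name ensure_expected_uid and the default UID of 1000 are illustrative, and os.getuid is only available on Unix-like systems.

```python
import os
from typing import Optional


def ensure_expected_uid(expected_uid: int = 1000, current_uid: Optional[int] = None) -> int:
    """Raise PermissionError if the process runs as root or as an unexpected UID."""
    # current_uid exists only to make the check testable; normally it is left as None.
    uid = os.getuid() if current_uid is None else current_uid
    if uid == 0:
        raise PermissionError("refusing to run as root (UID 0)")
    if uid != expected_uid:
        raise PermissionError(f"running as UID {uid}, expected {expected_uid}")
    return uid
```

Calling ensure_expected_uid() early in application startup turns a misconfigured image or security context into an immediate, readable failure instead of a later volume permission error.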

Application Server Tuning (Gunicorn/Uvicorn)

1. Handling Slow Library Imports

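Heavy AI libraries (like openai-agents) can significantly slow down startup, especially on resource-constrained nodes. Gunicorn's default worker timeout is 30 seconds, so a worker that is still importing when the timeout expires is killed and restarted before it serves a single request.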

gunicorn.conf.py Optimization:

python

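workers = 1  # Keep low on resource-constrained nodes; each worker imports the app separately
timeout = 120  # Seconds before a silent worker is killed; raised from the 30 s default for heavy imports
preload_app = False  # Gunicorn's default: the app loads inside each worker, so startup errors show up per worker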
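
To choose a timeout from measurement rather than guesswork, the import cost can be timed directly. The following is a rough sketch: the safety factor of 3 is an arbitrary assumption, and the measurement is only meaningful in a fresh interpreter, because a module that is already loaded returns almost instantly.

```python
import importlib
import math
import time


def time_import(module_name: str) -> float:
    """Return the seconds spent importing module_name in this process."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


def suggest_timeout(import_seconds: float, safety_factor: float = 3.0, floor: int = 30) -> int:
    """Scale the measured import time, never going below Gunicorn's 30 s default."""
    return max(floor, math.ceil(import_seconds * safety_factor))
```

With these assumptions, a 40-second import suggests a timeout of 120 seconds, which matches the value used in the configuration above. Running the measurement on the slowest node type gives the most conservative figure.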

Networking and Service Binding

1. Sidecar Communication

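For Dapr or other sidecar patterns to work reliably, the application must listen on 0.0.0.0, not only on 127.0.0.1. Containers in a pod share one network namespace, so a sidecar reaches the app over localhost, while kubelet probes and Service traffic arrive on the pod IP; a 127.0.0.1 bind serves the first path but not the second. Some frameworks instead bind to a single address taken from the environment when no host is specified, which can leave the loopback path unreachable for the sidecar.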

Next.js / Standalone Fix:

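dockerfile

# The standalone server binds to the address in HOSTNAME, which in a pod normally holds the pod name.
# Override it so the server listens on all interfaces.
ENV HOSTNAME="0.0.0.0"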
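
The effect of the bind address can be checked with plain sockets. This sketch uses only the standard library and stands in for the application server; the helper names listen_on and can_connect are illustrative.

```python
import socket


def listen_on(host: str) -> socket.socket:
    """Open a listening TCP socket on host, using an OS-assigned free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, 0))
    srv.listen(1)
    return srv


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    server = listen_on("0.0.0.0")
    port = server.getsockname()[1]
    # A wildcard bind accepts connections arriving on loopback as well as on the pod IP.
    print(can_connect("127.0.0.1", port))
    server.close()
```

Repeating the experiment with listen_on("127.0.0.1") and connecting to the pod IP from inside the pod shows the opposite case, where a loopback-only bind rejects traffic that arrives on another interface.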

2. Probe Consistency

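Align probe timing with the application's actual startup time. If the app takes 40s to start, a liveness probe with initialDelaySeconds: 10 begins failing before the app is up, and once the failure threshold is reached the kubelet restarts the container, which can repeat indefinitely.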

Probe Settings:

• Liveness: initialDelaySeconds: 120 (safe for slow nodes)
• Readiness: initialDelaySeconds: 60
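
The restart arithmetic behind these settings can be sketched as follows. The model is approximate: it assumes the Kubernetes defaults of periodSeconds: 10 and failureThreshold: 3, ignores probe timeouts and kubelet jitter, and the function names are illustrative.

```python
def earliest_liveness_restart(initial_delay: int, period: int = 10, failure_threshold: int = 3) -> int:
    """Approximate seconds after container start at which a never-passing liveness probe triggers a restart."""
    # The first probe runs at initial_delay; the restart follows the Nth consecutive failure.
    return initial_delay + period * (failure_threshold - 1)


def survives_startup(startup_seconds: int, initial_delay: int,
                     period: int = 10, failure_threshold: int = 3) -> bool:
    """Return True if the app finishes starting before the liveness probe gives up."""
    return startup_seconds < earliest_liveness_restart(initial_delay, period, failure_threshold)
```

Under this model, an app that needs 40 seconds to start is restarted at roughly 30 seconds with initialDelaySeconds: 10, while initialDelaySeconds: 120 leaves ample margin.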