k8s-python-fastapi-optimization

Best practices for containerizing and deploying Python FastAPI applications in Kubernetes, including multi-stage builds and Gunicorn tuning.

SKILL.md
--- frontmatter
name: k8s-python-fastapi-optimization
description: Best practices for containerizing and deploying Python FastAPI applications in Kubernetes, including multi-stage builds and Gunicorn tuning.

K8s Python/FastAPI Optimization

This skill focuses on resolving common bottlenecks and build errors when deploying Python services to a cluster.

Dockerfile Best Practices

1. Multi-stage Builds & Editable Installs

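Never use pip install -e . in a multi-stage Dockerfile where the production stage copies code to a different directory. An editable install does not copy the package into site-packages; it records a reference to the source directory (for example a .pth entry), and that reference points nowhere once the code has moved.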

Correct Build Stage:

dockerfile
# Production install: a regular (non-editable) install copies the package into site-packages
RUN pip install --no-cache-dir .
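To catch an accidental editable install before the image ships, a build-time check can inspect the metadata pip records for the package. The following is a minimal sketch, assuming a reasonably recent pip that writes a PEP 610 direct_url.json file; the function names are illustrative, not part of any library.

```python
import json
from importlib import metadata
from typing import Optional


def parse_editable(direct_url_json: Optional[str]) -> bool:
    """Return True if a PEP 610 direct_url.json payload marks an editable install."""
    if not direct_url_json:
        # No direct_url.json: installed from an index or a prebuilt wheel.
        return False
    info = json.loads(direct_url_json)
    return bool(info.get("dir_info", {}).get("editable", False))


def is_editable_install(dist_name: str) -> bool:
    """Check an installed distribution by name.

    Raises importlib.metadata.PackageNotFoundError if it is not installed.
    """
    dist = metadata.distribution(dist_name)
    return parse_editable(dist.read_text("direct_url.json"))
```

Calling is_editable_install with the project's distribution name in the final stage and failing the build on True would surface the problem at build time rather than at pod startup.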

2. User Permissions

Always use a non-root user with a fixed UID (e.g., 1000) for security and volume permission consistency.
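One way to enforce this at runtime is a small guard executed before the server starts. This is a sketch only: the expected UID of 1000 mirrors the example above, and check_runtime_uid is a hypothetical helper. Clusters that assign arbitrary UIDs would need the strict equality check relaxed.

```python
import os

EXPECTED_UID = 1000  # assumption: the UID created for the app user in the image


def check_runtime_uid(uid: int, expected: int = EXPECTED_UID) -> None:
    """Raise RuntimeError if the process runs as root or as an unexpected UID."""
    if uid == 0:
        raise RuntimeError("refusing to run as root")
    if uid != expected:
        raise RuntimeError(f"running as UID {uid}, expected {expected}")


def check_current_process() -> None:
    """Apply the check to the current process (POSIX only, uses os.getuid)."""
    check_runtime_uid(os.getuid())
```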

Application Server Tuning (Gunicorn/Uvicorn)

1. Handling Slow Library Imports

Heavy AI libraries (like openai-agents) can significantly slow down startup. If the import takes longer than Gunicorn's timeout (30 seconds by default), the arbiter kills the worker before it finishes booting.

gunicorn.conf.py Optimization:

python
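worker_class = "uvicorn.workers.UvicornWorker"  # ASGI worker for FastAPI (assumes uvicorn is installed)
workers = 1  # Reduce on resource-constrained nodes
timeout = 120  # Well above the 30 s default, to cover AI/heavy imports
preload_app = False  # Load the app in each worker so startup errors surface per worker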
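Rather than guessing a timeout, the import cost can be measured on the target node. The sketch below times a single import and derives a candidate timeout from it. The 3x margin and the 30-second floor are arbitrary assumptions, and the measurement is only meaningful in a fresh interpreter, because a module that is already loaded imports almost instantly.

```python
import importlib
import time


def measure_import_seconds(module_name: str) -> float:
    """Return the wall-clock seconds spent importing module_name."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


def suggested_timeout(import_seconds: float, margin: float = 3.0, floor: int = 30) -> int:
    """Scale the measured import time by a safety margin, never going below floor."""
    return max(floor, int(import_seconds * margin) + 1)
```

For example, a measured import time of 40 seconds with the default margin gives a suggested timeout of 121 seconds, close to the 120 used above.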

Networking and Service Binding

1. Sidecar Communication

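For Dapr or other sidecar patterns to work, the application should bind to 0.0.0.0 rather than to 127.0.0.1 or a single interface. Containers in a pod share one network namespace, so the sidecar reaches the app over localhost, while kubelet probes and Service traffic arrive on the pod IP. Many frameworks listen on only one of these addresses unless told otherwise, which breaks the other path.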

Next.js / Standalone Fix:

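dockerfile
# The Next.js standalone server reads HOSTNAME for its bind address.
# Kubernetes sets HOSTNAME to the pod name, so override it explicitly.
ENV HOSTNAME="0.0.0.0"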
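For the FastAPI service itself, the same rule can be sketched as a small launcher that defaults to all interfaces. The APP_HOST and APP_PORT variable names and the app.main:app import string are hypothetical; HOSTNAME is deliberately not read, since Kubernetes already sets it to the pod name.

```python
import os
from typing import Mapping, Tuple


def bind_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Resolve the listen address, defaulting to all interfaces on port 8000."""
    host = env.get("APP_HOST", "0.0.0.0")  # hypothetical variable name
    port = int(env.get("APP_PORT", "8000"))  # hypothetical variable name
    return host, port


def main() -> None:
    """Start the ASGI server on the resolved address."""
    import uvicorn  # imported lazily so bind_address() is usable without it

    host, port = bind_address()
    # "app.main:app" is a placeholder import string for the FastAPI instance.
    uvicorn.run("app.main:app", host=host, port=port)
```

When Gunicorn is the entry point instead, the equivalent setting would be bind = "0.0.0.0:8000" in gunicorn.conf.py.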

2. Probe Consistency

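Align probe timing with the application's actual startup time. If your app takes 40s to start, a liveness probe with initialDelaySeconds: 10 and the default periodSeconds: 10 and failureThreshold: 3 fails three times by roughly the 30s mark, so the kubelet restarts the container before it ever becomes healthy.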

Probe Settings:

  • Liveness: initialDelaySeconds: 120 (Safe for slow nodes)
  • Readiness: initialDelaySeconds: 60
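The arithmetic behind these values can be checked with a short calculation. This is a simplified model, assuming the first probe fires at initialDelaySeconds and every later one exactly periodSeconds apart; real kubelet timing includes some jitter. The defaults of 10 and 3 are the Kubernetes defaults for periodSeconds and failureThreshold.

```python
def earliest_restart_seconds(
    initial_delay: int, period: int = 10, failure_threshold: int = 3
) -> int:
    """Approximate time at which a never-passing liveness probe triggers a restart."""
    return initial_delay + (failure_threshold - 1) * period


def survives_startup(
    startup_seconds: int,
    initial_delay: int,
    period: int = 10,
    failure_threshold: int = 3,
) -> bool:
    """True if the app is up before the liveness probe would give up on it."""
    return startup_seconds < earliest_restart_seconds(
        initial_delay, period, failure_threshold
    )
```

With a 40s startup, survives_startup(40, 10) is False because the restart lands at about 30s, while survives_startup(40, 120) is True because the probe would not give up until about 140s.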