AgentSkillsCN

kubespray-ha-configuration

在搭建高可用性 Kubernetes 集群、配置多个控制平面节点、确定 etcd Quorum 大小,或为 API 服务器访问实施负载均衡时使用。

SKILL.md
--- frontmatter
name: kubespray-ha-configuration
description: Use when setting up high availability Kubernetes clusters, configuring multiple control plane nodes, etcd quorum sizing, or implementing load balancing for API server access.

Kubespray HA Configuration

Overview

High availability in Kubernetes requires redundant control plane nodes and etcd members. Kubespray automates HA setup but understanding the architecture is essential for proper sizing and troubleshooting.

Core principle: API servers are active-active (all handle requests). Controller-manager and scheduler use leader election (one active, others standby). etcd requires quorum (majority must agree).

When to Use

  • Deploying production clusters requiring uptime guarantees
  • Sizing etcd clusters (why odd numbers matter)
  • Configuring load balancing for API server
  • Understanding failover behavior

Not for: Single-node deployments (use kubespray-deployment), troubleshooting HA failures (use kubespray-troubleshooting)

HA Architecture

code
┌─────────────────────────────────────────────────────────┐
│                    Control Plane HA                      │
├─────────────────────────────────────────────────────────┤
│  API Server:        Active-Active (all handle requests) │
│  Controller-Mgr:    Active-Standby (leader election)    │
│  Scheduler:         Active-Standby (leader election)    │
│  etcd:              Raft consensus (quorum required)    │
└─────────────────────────────────────────────────────────┘

etcd Quorum Sizing

NodesQuorumTolerated Failures
110
321
532
743

Always use odd numbers. With 4 nodes, quorum is 3 - you can still only lose 1 node (same as 3 nodes). The extra node adds complexity without fault tolerance.

Inventory for HA Cluster

ini
[all]
k8s-ctr1 ansible_host=192.168.10.11 ip=192.168.10.11
k8s-ctr2 ansible_host=192.168.10.12 ip=192.168.10.12
k8s-ctr3 ansible_host=192.168.10.13 ip=192.168.10.13
k8s-w1 ansible_host=192.168.10.21 ip=192.168.10.21
k8s-w2 ansible_host=192.168.10.22 ip=192.168.10.22

[kube_control_plane]
k8s-ctr1
k8s-ctr2
k8s-ctr3

[etcd]
k8s-ctr1
k8s-ctr2
k8s-ctr3

[kube_node]
k8s-w1
k8s-w2

[k8s_cluster:children]
kube_control_plane
kube_node

This is stacked etcd - etcd runs on control plane nodes. For external etcd, use separate nodes in [etcd] group.

Load Balancing Options

Option 1: Client-Side Load Balancing (Default)

Kubespray deploys nginx/haproxy on each worker node:

code
Worker Node
    └── nginx (localhost:6443)
           ├── → k8s-ctr1:6443
           ├── → k8s-ctr2:6443
           └── → k8s-ctr3:6443

kubelet connects to localhost:6443, nginx distributes to healthy API servers.

Pros: No external infrastructure, no single point of failure Cons: Only for internal cluster traffic

Configure in group_vars/all/all.yml:

yaml
loadbalancer_apiserver_type: nginx  # or haproxy

Option 2: kube-vip (Software VIP)

Creates floating virtual IP that moves to healthy control plane.

Prerequisites:

  • ARP mode: All nodes must be on same Layer 2 broadcast domain (typical for on-prem, NOT supported in most public clouds like AWS, GCP, Azure)
  • BGP mode: Requires BGP-capable network infrastructure (routers that speak BGP)
yaml
# group_vars/k8s_cluster/k8s-cluster.yml
kube_vip_enabled: true
kube_vip_arp_enabled: true      # Use ARP mode (Layer 2)
# kube_vip_bgp_enabled: true    # Use BGP mode instead
kube_vip_address: 192.168.10.100  # unused IP for VIP

Pros: Single endpoint for external access, no external LB needed Cons: ARP requires Layer 2 network, BGP requires network infrastructure support

Option 3: External Load Balancer

For cloud or enterprise environments:

  • AWS: NLB/ALB
  • GCP: GCP Load Balancer
  • On-prem: HAProxy + keepalived

Configure endpoint in inventory:

yaml
# group_vars/all/all.yml
loadbalancer_apiserver:
  address: 192.168.10.100
  port: 6443

Kubespray does NOT provision external LBs - use Terraform or manual setup.

Verifying HA Setup

Check Leader Election

bash
kubectl get lease -n kube-system
# Shows kube-controller-manager and kube-scheduler leases
# holderIdentity shows current leader

Check etcd Cluster Health

bash
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/admin-$(hostname)-key.pem \
  --endpoints=https://192.168.10.11:2379,https://192.168.10.12:2379,https://192.168.10.13:2379 \
  endpoint health

Check etcd Leader

bash
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/admin-$(hostname)-key.pem \
  --endpoints=https://192.168.10.11:2379 \
  endpoint status --write-out=table

Testing Failover

Control Plane Failover

bash
# Stop kubelet on one control plane
ssh k8s-ctr1 "systemctl stop kubelet"

# Verify cluster still works
kubectl get nodes   # ctr1 shows NotReady
kubectl run test --image=nginx  # should succeed

# Restore
ssh k8s-ctr1 "systemctl start kubelet"

etcd Failover

bash
# Stop etcd on one node (WITH BACKUP FIRST)
ssh k8s-ctr1 "systemctl stop etcd"

# Check remaining cluster health
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://192.168.10.12:2379,https://192.168.10.13:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-k8s-ctr2.pem \
  --key=/etc/ssl/etcd/ssl/admin-k8s-ctr2-key.pem

# Verify writes work
kubectl create configmap test-ha --from-literal=key=value

# Restore
ssh k8s-ctr1 "systemctl start etcd"

Stacked vs External etcd

AspectStackedExternal
Nodes required3 (min HA)6 (3 CP + 3 etcd)
ComplexityLowerHigher
Resource isolationSharedDedicated
Failure domainCoupledIndependent

Recommendation: Stacked for most cases. External when etcd performance is critical or you need independent scaling.

Common Errors (Searchable)

code
etcdserver: request timed out

Cause: etcd quorum lost (too many nodes down). Fix: Restore failed nodes or restore from backup.

code
kube-vip: failed to send gratuitous ARP

Cause: kube-vip ARP mode on non-Layer 2 network. Fix: Use BGP mode or external LB.

code
connection refused to <vip>:6443

Cause: VIP not responding. Fix: Check kube-vip pods, verify VIP is on correct node.

code
Unable to connect to the server: dial tcp: lookup <hostname>: no such host

Cause: DNS not resolving LB hostname. Fix: Use IP address or fix DNS.

Common Mistakes

MistakeConsequence
Even number of etcd nodesNo additional fault tolerance
Missing external LB for kubectlCan't reach API from outside cluster
Removing control plane without drainingDisrupted workloads
2-node etcd clusterWorse than 1 node (quorum=2, lose 1 = down)
kube-vip ARP in public cloudVIP never works (no L2 broadcast)