While a node was being joined to the cluster, kube-scheduler-k8scp-01 on the control-plane node restarted. The logs are as follows:
```
I0408 10:23:44.110342 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0408 10:23:44.110368 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0408 10:23:44.110404 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0408 10:23:44.110424 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0408 10:23:44.209318 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0408 10:23:44.209532 1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-scheduler...
I0408 10:23:44.210501 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0408 10:23:44.210578 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0408 10:24:01.121601 1 leaderelection.go:253] successfully acquired lease kube-system/kube-scheduler
E0408 10:24:36.431925 1 leaderelection.go:361] Failed to update lock: etcdserver: request timed out
E0408 10:24:39.410935 1 leaderelection.go:325] error retrieving resource lock kube-system/kube-scheduler: Get "https://172.31.253.61:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler?timeout=10s": context deadline exceeded
I0408 10:24:39.411094 1 leaderelection.go:278] failed to renew lease kube-system/kube-scheduler: timed out waiting for the condition
F0408 10:24:39.411390 1 server.go:205] leaderelection lost
goroutine 1 [running]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc00000e001, 0xc00059c540, 0x41, 0xd5)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x2dc7a40, 0xc000000003, 0x0, 0x0, 0xc0004c1c00, 0x2ce5b18, 0x9, 0xcd, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:975 +0x19b
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).printf(0x2dc7a40, 0x3, 0x0, 0x0, 0x0, 0x0, 0x1da20c9, 0x13, 0x0, 0x0, ...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:750 +0x191
k8s.io/kubernetes/vendor/k8s.io/klog/v2.Fatalf(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1502
k8s.io/kubernetes/cmd/kube-scheduler/app.Run.func3()
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:205 +0x8f
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1(0xc000768a20)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:199 +0x29
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc000768a20, 0x2018040, 0xc00067d300)
```
The logs show that the failure was caused by etcd requests timing out: the update of the kube-system/kube-scheduler lease failed with "etcdserver: request timed out", the scheduler could not renew the lease within its renew deadline, leader election was lost, and the process exited with a fatal error, which is why the pod restarted.
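This fatal exit is the standard behavior of client-go's leader election: when the lease cannot be renewed, the OnStoppedLeading callback fires, and kube-scheduler's callback (server.go:205 in the stack trace) calls klog.Fatalf. The sketch below shows the same pattern with client-go; it is illustrative only. The lock name demo-scheduler, identity demo-holder, and KUBECONFIG-based client setup are assumptions, while the 15s/10s/2s timings match the scheduler's defaults.

```go
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
	"k8s.io/klog/v2"
)

func main() {
	// Client setup via KUBECONFIG is illustrative; kube-scheduler builds its
	// client from its own --kubeconfig flag instead.
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		klog.Fatalf("failed to build client config: %v", err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// A coordination.k8s.io/v1 Lease lock, the same kind of object as
	// kube-system/kube-scheduler in the log. Name and identity are made up.
	lock := &resourcelock.LeaseLock{
		LeaseMeta: metav1.ObjectMeta{Namespace: "kube-system", Name: "demo-scheduler"},
		Client:    client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{
			Identity: "demo-holder",
		},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second, // scheduler defaults: 15s / 10s / 2s
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// "successfully acquired lease" in the log corresponds to this point.
				klog.Info("acquired lease, doing work")
				<-ctx.Done()
			},
			OnStoppedLeading: func() {
				// Renewal failed within RenewDeadline (in the incident above,
				// because the etcd request timed out), so the process exits
				// fatally, matching "F0408 ... leaderelection lost" in the log.
				klog.Fatalf("leaderelection lost")
			},
		},
	})
}
```

Exiting on lost leadership is deliberate fail-fast behavior: it guarantees that two scheduler instances never act as leader at the same time, and it relies on kubelet restarting the static pod, which is exactly the restart observed here.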