My local Kubernetes cluster was running fine until yesterday, when I installed some extra components. slave1 and slave2 each had only 4 GB of RAM, and I found that free memory was down to roughly 100 MB, so I stopped the VMs, increased the KVM guest memory to 8 GB, and then confirmed that each node had more than 2 GB free. But now the slave1 and slave2 nodes are not working properly. Here is the node status:
[root@k8smaster ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8smaster Ready master 12d v1.18.5 192.168.31.29 <none> CentOS Linux 8 (Core) 4.18.0-193.6.3.el8_2.x86_64 docker://19.3.12
k8sslave1 NotReady <none> 12d v1.18.5 192.168.31.30 <none> CentOS Linux 8 (Core) 4.18.0-193.6.3.el8_2.x86_64 docker://19.3.12
k8sslave2 NotReady <none> 12d v1.18.5 192.168.31.31 <none> CentOS Linux 8 (Core) 4.18.0-193.6.3.el8_2.x86_64 docker://19.3.12
Then I described one of the failing nodes; it looks like this:
[root@k8smaster ~]# kubectl describe node k8sslave1
Name: k8sslave1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8sslave1
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 192.168.31.30/24
projectcalico.org/IPv4IPIPTunnelAddr: 10.11.157.64
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 13 Jul 2020 11:50:48 -0400
Taints: node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8sslave1
AcquireTime: <unset>
RenewTime: Sat, 25 Jul 2020 09:47:24 -0400
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sat, 25 Jul 2020 00:48:55 -0400 Sat, 25 Jul 2020 00:48:55 -0400 CalicoIsUp Calico is running on this node
MemoryPressure Unknown Sat, 25 Jul 2020 09:43:45 -0400 Sat, 25 Jul 2020 09:48:07 -0400 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Sat, 25 Jul 2020 09:43:45 -0400 Sat, 25 Jul 2020 09:48:07 -0400 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Sat, 25 Jul 2020 09:43:45 -0400 Sat, 25 Jul 2020 09:48:07 -0400 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Sat, 25 Jul 2020 09:43:45 -0400 Sat, 25 Jul 2020 09:48:07 -0400 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.31.30
Hostname: k8sslave1
Capacity:
cpu: 2
ephemeral-storage: 36702712Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 4311228Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 33825219324
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 4208828Ki
pods: 110
System Info:
Machine ID: 0c9c1291618645498e63ddfe3895658a
System UUID: b25d27cf-4dea-44fe-96d8-a75e0c138187
Boot ID: 3290a714-0e18-47dd-a811-2dd16c8a17c9
Kernel Version: 4.18.0-193.6.3.el8_2.x86_64
OS Image: CentOS Linux 8 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.12
Kubelet Version: v1.18.5
Kube-Proxy Version: v1.18.5
PodCIDR: 10.11.1.0/24
PodCIDRs: 10.11.1.0/24
Non-terminated Pods: (16 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default apm-server-filebeat-dn48j 100m (5%) 1 (50%) 100Mi (2%) 200Mi (4%) 15h
default traefik-88f7c94bf-tdz2m 0 (0%) 0 (0%) 0 (0%) 0 (0%) 8d
infrastructure elasticsearch-elasticsearch-coordinating-only-7744945d6d-zwz7z 25m (1%) 0 (0%) 256Mi (6%) 0 (0%) 15h
infrastructure harbor-harbor-chartmuseum-575cdf84f6-2m5t5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d14h
infrastructure harbor-harbor-clair-6464c85c99-zb997 0 (0%) 0 (0%) 0 (0%) 0 (0%) 35h
infrastructure harbor-harbor-notary-signer-5d9b779f54-fwzl8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 36h
infrastructure harbor-harbor-portal-59c779dd74-lj5zl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d14h
infrastructure harbor-harbor-registry-6ffb84b667-cvxwq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d14h
infrastructure harbor-harbor-trivy-0 200m (10%) 1 (50%) 512Mi (12%) 1Gi (24%) 36h
infrastructure jenkins-845bd5bcd4-4mkqn 50m (2%) 2 (100%) 256Mi (6%) 4Gi (99%) 4d14h
kube-system calico-kube-controllers-75d555c48-wd84b 0 (0%) 0 (0%) 0 (0%) 0 (0%) 12d
kube-system calico-node-2sj6v 250m (12%) 0 (0%) 0 (0%) 0 (0%) 12d
kube-system coredns-676d976fcb-bxzcs 100m (5%) 0 (0%) 70Mi (1%) 170Mi (4%) 9d
kube-system kube-proxy-f4lg4 0 (0%) 0 (0%) 0 (0%) 0 (0%) 12d
monitoring prometheus-1595085197-node-exporter-ztgd4 0 (0%) 0 (0%) 0 (0%) 0 (0%) 7d13h
monitoring prometheus-1595085197-server-57967bb676-ksl2k 0 (0%) 0 (0%) 0 (0%) 0 (0%) 7d13h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 725m (36%) 4 (200%)
memory 1194Mi (29%) 5490Mi (133%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
It says the kubelet stopped posting node status, so I checked the kubelet service on slave1:
[root@k8sslave1 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2020-07-26 00:30:45 EDT; 8min ago
Docs: https://kubernetes.io/docs/
Main PID: 7192 (kubelet)
Tasks: 17 (limit: 49628)
Memory: 41.5M
CGroup: /system.slice/kubelet.service
└─7192 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image>
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.625224 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.725575 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.825956 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.929822 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.030028 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.130344 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.230562 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.330896 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.431111 7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.531472 7192 kubelet.go:2267] node "k8sslave1" not found
The kubelet process itself is running, but it keeps logging `node "k8sslave1" not found`. Why am I getting this error, and how can I fix it?
Answer 0 (score: 1)
Are you using kubeadm? If so, you can follow these steps:
Delete the worker node (from the master):
kubectl delete node k8sslave1
Then, on the worker node itself, run:
kubeadm reset
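Note that `kubeadm reset` does not remove CNI configuration or iptables rules, and stale Calico state can confuse the next join. A sketch of the extra cleanup that is often needed (the `run` helper and `DRY_RUN` guard are my additions so the commands can be previewed first; set `DRY_RUN=0` to actually execute them as root on the worker):

```shell
#!/bin/sh
# Extra cleanup after `kubeadm reset`: it does NOT remove CNI config
# or iptables rules left behind by Calico and kube-proxy.
# Defaults to a dry run that only prints each command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run rm -rf /etc/cni/net.d     # stale CNI (Calico) config
run iptables -F               # flush filter-table rules
run iptables -t nat -F        # flush NAT rules left by kube-proxy
```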
Now rejoin the worker to the cluster. On the master node, run:
token=$(kubeadm token generate)
kubeadm token create $token --ttl 2h --print-join-command
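As a sanity check: `kubeadm token generate` only produces a token string locally (it does not contact the cluster), and bootstrap tokens always have the fixed format `[a-z0-9]{6}.[a-z0-9]{16}`. A small sketch (the `is_valid_token` helper is my own) for validating a token before passing it to `kubeadm token create`:

```shell
# kubeadm bootstrap tokens have the form <6 chars>.<16 chars>,
# lowercase letters and digits only. Validate before use.
is_valid_token() {
    echo "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

is_valid_token "abcdef.0123456789abcdef" && echo "valid"
is_valid_token "not-a-token" || echo "invalid"
```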
Paste the output of that command on the worker node. It will look like:
kubeadm join ...
Check that the node has rejoined the cluster and that its status is now "Ready":
ubuntu@kube-master:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-master Ready master 20d v1.18.1
kube-worker-1 Ready <none> 20d v1.18.1
kube-worker-2 Ready <none> 12m v1.18.1
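Instead of eyeballing that table, you can filter the `kubectl get nodes` output for anything that is not Ready (a sketch; the `not_ready` helper is my own and just parses the STATUS column):

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin.
not_ready() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Only run against a real cluster if kubectl is available.
command -v kubectl >/dev/null && kubectl get nodes | not_ready
```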
I hope this helps. :)