我正在使用法兰绒将节点添加为Kubernetes集群中的一个节点。这是我的集群上的节点:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
jetson-80 NotReady <none> 167m v1.15.0
p4 Ready master 18d v1.15.0
可以通过同一网络访问本机。加入集群时,Kubernetes会拉一些图像,其中包括k8s.gcr.io/pause:3.1,但是由于某种原因未能拉出图像:
Warning FailedCreatePodSandBox 15d
kubelet,jetson-81 Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Error response from daemon: Get https://k8s.gcr.io/v2/: read tcp 192.168.8.81:58820->108.177.126.82:443: read: connection reset by peer
机器已连接到Internet,但仅wget
命令有效,ping
我试图将图像拖到其他地方,然后将其复制到计算机上。
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.15.0 d235b23c3570 2 months ago 82.4MB
quay.io/coreos/flannel v0.11.0-arm64 32ffa9fadfd7 6 months ago 53.5MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 20 months ago 742kB
以下是主机上的Pod列表:
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-gmsz7 1/1 Running 0 2d22h
coredns-5c98db65d4-j6gz5 1/1 Running 0 2d22h
etcd-p4 1/1 Running 0 2d22h
kube-apiserver-p4 1/1 Running 0 2d22h
kube-controller-manager-p4 1/1 Running 0 2d22h
kube-flannel-ds-amd64-cq7kz 1/1 Running 9 17d
kube-flannel-ds-arm64-4s7kk 0/1 Init:CrashLoopBackOff 0 2m8s
kube-proxy-l2slz 0/1 CrashLoopBackOff 4 2m8s
kube-proxy-q6db8 1/1 Running 0 2d22h
kube-scheduler-p4 1/1 Running 0 2d22h
tiller-deploy-5d6cc99fc-rwdrl 1/1 Running 1 17d
但是当我检查关联的flannel
吊舱kube-flannel-ds-arm64-4s7kk
时,它也没有起作用:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 66s default-scheduler Successfully assigned kube-system/kube-flannel-ds-arm64-4s7kk to jetson-80
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 68ffc44cf8cd655234691b0362615f97c59d285bec790af40f890510f27ba298
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: a196d8540b68dc7fcd97b0cda1e2f3183d1410598b6151c191b43602ac2faf8e
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 9d05d1fcb54f5388ca7e64d1b6627b05d52aea270114b5a418e8911650893bc6
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 5b730961cddf5cc3fb2af564b1abb46b086073d562bb2023018cd66fc5e96ce7
Normal Created <invalid> (x5 over <invalid>) kubelet, jetson-80 Created container install-cni
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 1767e9eb9198969329eaa14a71a110212d6622a8b9844137ac5b247cb9e90292
Normal SandboxChanged <invalid> (x5 over <invalid>) kubelet, jetson-80 Pod sandbox changed, it will be killed and re-created.
Warning BackOff <invalid> (x4 over <invalid>) kubelet, jetson-80 Back-off restarting failed container
Normal Pulled <invalid> (x6 over <invalid>) kubelet, jetson-80 Container image "quay.io/coreos/flannel:v0.11.0-arm64" already present on machine
我仍然无法确定这是Kubernetes还是Flannel问题,尽管进行了多次尝试,但仍无法解决。如果您需要我分享更多详细信息,请告诉我
编辑:
使用kubectl describe pod -n kube-system kube-proxy-l2slz
:
Normal Pulled <invalid> (x67 over <invalid>) kubelet, ahold-jetson-80 Container image "k8s.gcr.io/kube-proxy:v1.15.0" already present on machine
Normal SandboxChanged <invalid> (x6910 over <invalid>) kubelet, ahold-jetson-80 Pod sandbox changed, it will be killed and re-created.
Warning FailedSync <invalid> (x77 over <invalid>) kubelet, ahold-jetson-80 (combined from similar events): error determining status: rpc error: code = Unknown desc = Error: No such container: 03e7ee861f8f63261ff9289ed2d73ea5fec516068daa0f1fe2e4fd50ca42ad12
Warning BackOff <invalid> (x8437 over <invalid>) kubelet, ahold-jetson-80 Back-off restarting failed container
答案 0 :(得分:0)
您的问题可能是由节点中的mutil沙箱容器引起的。尝试重新启动kubelet:
$ systemctl restart kubelet
检查是否已生成公钥并将其复制到右节点以使其之间具有连接:ssh-keygen。
请确保防火墙/安全组允许UDP端口58820上的通信。 查看绒布日志,看看那里是否有任何错误,还要查找“ Subnet included:”消息。每个节点都应该添加另外两个子网。
在运行ping时,请尝试使用 tcpdump 来查看丢包的位置。
尝试src flannel0(icmp),src主机接口(udp端口58820),dest主机接口(udp端口58820),dest flannel0(icmp),docker0(icmp)。
以下是有用的文档:flannel-documentation。