Adding a node to a cluster with Flannel: "cannot join network of a non running container"

Time: 2019-08-25 11:26:28

Tags: kubernetes kubeadm flannel

I am adding a node to my Kubernetes cluster, which uses Flannel. These are the nodes in my cluster, as shown by kubectl get nodes:

NAME              STATUS     ROLES    AGE    VERSION
jetson-80         NotReady   <none>   167m   v1.15.0
p4                Ready      master   18d    v1.15.0

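The new machine is reachable over the same network. When it joins the cluster, Kubernetes pulls some images, including k8s.gcr.io/pause:3.1, but for some reason the pull fails: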

Warning  FailedCreatePodSandBox  15d  kubelet,jetson-81  Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Error response from daemon: Get https://k8s.gcr.io/v2/: read tcp 192.168.8.81:58820->108.177.126.82:443: read: connection reset by peer

The machine is connected to the Internet, but only the wget command works; ping does not.

I tried pulling the images elsewhere and copying them to the machine. These are the images now present on it:

REPOSITORY                                               TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-proxy                                    v1.15.0             d235b23c3570        2 months ago        82.4MB
quay.io/coreos/flannel                                   v0.11.0-arm64       32ffa9fadfd7        6 months ago        53.5MB
k8s.gcr.io/pause                                         3.1                 da86e6ba6ca1        20 months ago       742kB
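
For readers who want to reproduce the copy step, here is a minimal sketch of one way to move an image with docker save and docker load. The function names, the tarball path, and the default platform are illustrative assumptions, not taken from the question. The linux/arm64 default is only a guess based on the arm64 flannel image above, and docker pull --platform needs a Docker version that supports that flag. Printing the architecture after loading may help rule out an image built for a different CPU than the node.

```shell
#!/bin/sh
# Sketch only: helper names, paths and the default platform are assumptions.

# Run on a machine that can reach the registry.
# $1 = image, $2 = tarball to write, $3 = platform of the target node (optional).
export_image() {
    image="$1"
    tarball="$2"
    platform="${3:-linux/arm64}"
    docker pull --platform "$platform" "$image" &&
    docker save -o "$tarball" "$image"
}

# Run on the node after copying the tarball there (for example with scp).
# Loads the image, then prints its OS/architecture for comparison with `uname -m`.
# $1 = tarball, $2 = image.
import_image() {
    tarball="$1"
    image="$2"
    docker load -i "$tarball" &&
    docker image inspect --format '{{.Os}}/{{.Architecture}}' "$image"
}
```

A possible use would be export_image k8s.gcr.io/pause:3.1 pause-3.1.tar on the connected machine, followed by import_image pause-3.1.tar k8s.gcr.io/pause:3.1 on the node.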

Here is the list of pods on the master:

NAME                              READY   STATUS                  RESTARTS   AGE
coredns-5c98db65d4-gmsz7          1/1     Running                 0          2d22h
coredns-5c98db65d4-j6gz5          1/1     Running                 0          2d22h
etcd-p4                           1/1     Running                 0          2d22h
kube-apiserver-p4                 1/1     Running                 0          2d22h
kube-controller-manager-p4        1/1     Running                 0          2d22h
kube-flannel-ds-amd64-cq7kz       1/1     Running                 9          17d
kube-flannel-ds-arm64-4s7kk       0/1     Init:CrashLoopBackOff   0          2m8s
kube-proxy-l2slz                  0/1     CrashLoopBackOff        4          2m8s
kube-proxy-q6db8                  1/1     Running                 0          2d22h
kube-scheduler-p4                 1/1     Running                 0          2d22h
tiller-deploy-5d6cc99fc-rwdrl     1/1     Running                 1          17d

But when I inspect the associated flannel pod, kube-flannel-ds-arm64-4s7kk, it is not working either:

  Type     Reason          Age                            From                      Message
  ----     ------          ----                           ----                      -------
  Normal   Scheduled       66s                            default-scheduler         Successfully assigned kube-system/kube-flannel-ds-arm64-4s7kk to jetson-80
  Warning  Failed          <invalid>                      kubelet, jetson-80        Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 68ffc44cf8cd655234691b0362615f97c59d285bec790af40f890510f27ba298
  Warning  Failed          <invalid>                      kubelet, jetson-80        Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: a196d8540b68dc7fcd97b0cda1e2f3183d1410598b6151c191b43602ac2faf8e
  Warning  Failed          <invalid>                      kubelet, jetson-80        Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 9d05d1fcb54f5388ca7e64d1b6627b05d52aea270114b5a418e8911650893bc6
  Warning  Failed          <invalid>                      kubelet, jetson-80        Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 5b730961cddf5cc3fb2af564b1abb46b086073d562bb2023018cd66fc5e96ce7
  Normal   Created         <invalid> (x5 over <invalid>)  kubelet, jetson-80        Created container install-cni
  Warning  Failed          <invalid>                      kubelet, jetson-80        Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 1767e9eb9198969329eaa14a71a110212d6622a8b9844137ac5b247cb9e90292
  Normal   SandboxChanged  <invalid> (x5 over <invalid>)  kubelet, jetson-80        Pod sandbox changed, it will be killed and re-created.
  Warning  BackOff         <invalid> (x4 over <invalid>)  kubelet, jetson-80        Back-off restarting failed container
  Normal   Pulled          <invalid> (x6 over <invalid>)  kubelet, jetson-80        Container image "quay.io/coreos/flannel:v0.11.0-arm64" already present on machine

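I still cannot tell whether this is a Kubernetes issue or a Flannel issue, and despite several attempts I have not been able to fix it. Please let me know if you need me to share more details.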

EDIT

Output of kubectl describe pod -n kube-system kube-proxy-l2slz:

  Normal   Pulled          <invalid> (x67 over <invalid>)    kubelet, ahold-jetson-80  Container image "k8s.gcr.io/kube-proxy:v1.15.0" already present on machine
  Normal   SandboxChanged  <invalid> (x6910 over <invalid>)  kubelet, ahold-jetson-80  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedSync      <invalid> (x77 over <invalid>)    kubelet, ahold-jetson-80  (combined from similar events): error determining status: rpc error: code = Unknown desc = Error: No such container: 03e7ee861f8f63261ff9289ed2d73ea5fec516068daa0f1fe2e4fd50ca42ad12
  Warning  BackOff         <invalid> (x8437 over <invalid>)  kubelet, ahold-jetson-80  Back-off restarting failed container

1 answer:

Answer 0 (score: 0)

Your problem may be caused by multiple sandbox containers on the node. Try restarting the kubelet:

$ systemctl restart kubelet
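
As a rough sketch of how one might look at the sandbox containers before and after the restart: the helpers below assume Docker is the container runtime, that systemd manages the kubelet, and that sandbox containers carry the k8s_POD name prefix used by the kubelet's Docker integration. The function names are hypothetical, and the commands may need to be run as root.

```shell
#!/bin/sh
# Sketch only: assumes Docker as the runtime and a systemd-managed kubelet.

# List pod sandbox ("pause") containers, including exited ones.
list_sandboxes() {
    docker ps -a --filter "name=k8s_POD" --format '{{.ID}} {{.Status}} {{.Names}}'
}

# Restart the kubelet, then print its 50 most recent log lines.
restart_kubelet() {
    systemctl restart kubelet &&
    journalctl -u kubelet --no-pager -n 50
}
```

Running list_sandboxes, then restart_kubelet, then list_sandboxes again shows whether the same pod keeps getting new sandboxes.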

Check that you have generated a public key and copied it to the right node, so that the nodes can connect to each other: ssh-keygen

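Make sure the firewall/security groups allow traffic on UDP port 58820. Look at the flannel logs to see whether they contain any errors, and also look for "Subnet added:" messages. Each node should have added the subnets of the other nodes.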
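
The log check could be sketched as below. The helper names are hypothetical. The filter simply keeps lines that mention a subnet, an error, or a failure, which is an assumption about what is worth reading rather than a documented flannel log format. The pod name is passed in by the caller.

```shell
#!/bin/sh
# Sketch only: the grep pattern is a guess at which log lines are interesting.

# Read flannel log text on stdin and keep subnet announcements and error-looking lines.
flannel_log_summary() {
    grep -iE 'subnet|error|failed'
}

# Fetch the logs of one flannel pod ($1) from kube-system and summarize them.
check_flannel_pod() {
    kubectl -n kube-system logs "$1" --all-containers=true | flannel_log_summary
}
```

For example, check_flannel_pod kube-flannel-ds-amd64-cq7kz would summarize the logs of the flannel pod that is running on the master in the listing above.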

While ping is running, try using tcpdump to see where the packets get dropped.

Capture in this order: src flannel0 (icmp), src host interface (udp port 58820), dest host interface (udp port 58820), dest flannel0 (icmp), docker0 (icmp).
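
One way to turn the capture points above into concrete commands is sketched here. The helper name, the eth0 default, and the way the port is passed in are assumptions. The interface and port also depend on the flannel backend in use: flannel0 belongs to the udp backend (default port 8285), while the vxlan backend typically creates flannel.1 and uses port 8472, so adjust the values to the actual setup.

```shell
#!/bin/sh
# Sketch only: prints one tcpdump command per capture point on a node.
# $1 = host interface (assumed eth0), $2 = UDP port carrying flannel traffic.
capture_plan() {
    host_if="${1:-eth0}"
    port="${2:-58820}"
    echo "tcpdump -ni flannel0 icmp"
    echo "tcpdump -ni $host_if udp port $port"
    echo "tcpdump -ni docker0 icmp"
}
```

Running capture_plan on both the source and the destination node, and starting each printed command as root in its own terminal while ping runs, shows the last interface on which the packets are still visible.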

Here is some useful documentation: flannel-documentation