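Cilium containers crash because of an invalid node IP range

Time: 2018-12-14 15:15:33

Tags: kubernetes cilium

I am deploying a Kubernetes cluster with kubespray. I changed the network plugin from calico to cilium.

Unfortunately, some of the Cilium pods are stuck in CrashLoopBackOff.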

kubectl --namespace kube-system get pods --selector k8s-app=cilium --sort-by='.status.containerStatuses[0].restartCount' -o wide
NAME           READY   STATUS             RESTARTS   AGE   IP            NODE          NOMINATED NODE   READINESS GATES
cilium-2gmwm   1/1     Running            0          14m   10.10.3.102   nodemaster1   <none>           <none>
cilium-9ccdp   1/1     Running            0          14m   10.10.3.110   node6         <none>           <none>
cilium-c9nh6   1/1     Running            0          14m   10.10.3.107   node3         <none>           <none>
cilium-r9w4z   0/1     CrashLoopBackOff   6          14m   10.10.3.109   node5         <none>           <none>
cilium-f8z2q   1/1     Running            0          14m   10.10.3.105   node1         <none>           <none>
cilium-d96cd   0/1     CrashLoopBackOff   7          14m   10.10.3.106   node2         <none>           <none>
cilium-jgmcf   0/1     CrashLoopBackOff   7          14m   10.10.3.103   nodemaster2   <none>           <none>
cilium-9zqnr   0/1     CrashLoopBackOff   7          14m   10.10.3.108   node4         <none>           <none>
cilium-llt9p   0/1     CrashLoopBackOff   7          14m   10.10.3.104   nodemaster3   <none>           <none>

When I check the logs of the crashed pods, I see this fatal error message:

level=fatal msg="The allocation CIDR is different from the previous cilium instance. This error is most likely caused by a temporary network disruption to the kube-apiserver that prevent Cilium from retrieve the node's IPv4/IPv6 allocation range. If you believe the allocation range is supposed to be different you need to clean up all Cilium state with the `cilium cleanup` command on this node. Be aware this will cause network disruption for all existing containers managed by Cilium running on this node and you will have to restart them." error="Unable to allocate internal IPv4 node IP 10.233.71.1: provided IP is not in the valid range. The range of valid IPs is 10.233.70.0/24." subsys=daemon
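For reference, a small sketch of how such a log can be fetched from a pod that keeps restarting. The pod name comes from the listing above; `--previous` asks kubectl for the log of the last terminated container instance, which is usually where the fatal message ends up while the pod is in CrashLoopBackOff. The function name `crashed_cilium_log` is made up for illustration.

```shell
# Print the log of the previously crashed container of one Cilium pod.
# $1 = pod name, e.g. cilium-r9w4z from the listing above
crashed_cilium_log() {
  kubectl --namespace kube-system logs "$1" --previous
}

# Usage: crashed_cilium_log cilium-r9w4z
```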

It seems that the node's IP (10.233.71.1 in this case) does not fall within the valid range 10.233.70.0/24.
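The mismatch can be confirmed with a bit of arithmetic. The sketch below is plain POSIX shell and only illustrates the range check behind the error message; it is not how Cilium itself performs it. The helper names `ip_to_int` and `in_cidr` are made up for illustration.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR $2.
in_cidr() (
  net=${2%/*}    # network part, e.g. 10.233.70.0
  bits=${2#*/}   # prefix length, e.g. 24
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

# The pair from the error message:
if in_cidr 10.233.71.1 10.233.70.0/24; then
  echo "in range"
else
  echo "out of range"
fi
```

With a /24 mask only the last octet may vary, so any address starting with 10.233.71 falls outside 10.233.70.0/24 and the check reports "out of range".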

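I tried modifying kubespray's main.yaml file to change the subnets, but my many attempts only made the number of crashes go up or down...

For example, I tried a run with these settings: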

kube_service_addresses: 10.233.0.0/17
kube_pods_subnet: 10.233.128.0/17
kube_network_node_prefix: 18

As you can see, it did not work. If you have any ideas... :-)
Thanks
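A side note on the settings tried above, as a rough sketch: assuming `kube_network_node_prefix` is the prefix length of the pod range handed to each node (which is how I understand the kubespray setting), a /17 pod subnet cut into /18 ranges leaves room for only two nodes, while the listing above shows nine. That could be one reason this particular attempt had no chance of working, independent of the stale-state error. The helper name `node_ranges` is made up for illustration.

```shell
# Number of per-node pod ranges that fit into the pod subnet.
# $1 = prefix length of kube_pods_subnet
# $2 = kube_network_node_prefix (assumed to be the per-node prefix length)
node_ranges() {
  echo $(( 1 << ($2 - $1) ))
}

node_ranges 17 18   # the settings above: prints 2
node_ranges 17 24   # hypothetical /24 per node in the same /17: prints 128
```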

1 answer:

Answer 0: (score: 0)

I finally solved the problem with the help of the Cilium developers!

You have to change the key clean-cilium-state from false to true in the kubespray file kubespray/roles/network_plugin/cilium/templates/cilium-config.yml.j2

After the deployment, you have to revert this boolean. To do so, run kubectl edit configmap cilium-config -n kube-system and change the key clean-cilium-state from true back to false.
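If you prefer not to open an editor, the same change can probably be made with `kubectl patch`. This is a sketch under the assumption that clean-cilium-state is stored as a plain string under `data` in the cilium-config ConfigMap, which is what the edit step above implies. The function name is made up for illustration.

```shell
# Set clean-cilium-state back to "false" without opening an editor.
# ConfigMap values are strings, hence the quoted "false" in the JSON patch.
reset_clean_cilium_state() {
  kubectl --namespace kube-system patch configmap cilium-config \
    --type merge \
    --patch '{"data":{"clean-cilium-state":"false"}}'
}

# Usage: reset_clean_cilium_state
```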

Finally, you have to kill the Cilium pods.
List the pods: kubectl get pods -n kube-system
Kill the pods: kubectl delete pods -n kube-system cilium-xxx cilium-xxx ...
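Instead of typing every pod name, the pods can be selected by the k8s-app=cilium label that the listing in the question already uses. A minimal sketch, assuming the Cilium pods are managed by a DaemonSet (as the one-pod-per-node listing suggests) so that they are recreated automatically after deletion. The function name is made up for illustration.

```shell
# Delete every pod carrying the k8s-app=cilium label; the controller that
# owns them is expected to start fresh pods that read the updated ConfigMap.
restart_cilium_pods() {
  kubectl --namespace kube-system delete pods --selector k8s-app=cilium
}

# Usage: restart_cilium_pods
```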

It is now listed as an issue in the Cilium repository.