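My EKS cluster is unhealthy: all pods are stuck with a "ContainerCreating" error, which may be related to a CNI problem.
When I launch new worker nodes, they never reach the "Ready" state and report the following error: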
"couldn't get current server API group list; will keep using cached value. (Get https://172.20.0.1:443/api?timeout=32s: dial tcp
172.20.0.1:443: i/o timeout) Failed to communicate with K8S Server. Please check instance security groups or http proxy setting"
I am not using an HTTP proxy, and the security groups allow the private CIDR (telnet to the API server on port 443 works).
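The telnet check can be scripted so that it is easy to repeat from a worker node. This is a minimal sketch, not part of the original setup; the function name `can_connect` and the 5-second default timeout are assumptions chosen for illustration. Note that the address in the error, 172.20.0.1, is likely the in-cluster service address rather than the API server endpoint itself, so it is worth probing both.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not TLS or Kubernetes authentication.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("172.20.0.1", 443)` probes the address from the error message. If that returns False while the API server endpoint is reachable, the problem is probably on the path between the node and the service address rather than in the security groups.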
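My CNI version is 1.5.5. Following some threads about this issue, I tried downgrading the CNI to 1.5.3 (the nodes still did not join) and then to 1.5.1 (the nodes joined, since the /etc/cni/net.d/10-aws.conflist file exists, but pods could not connect to them).
In version 1.5.5 the location of the conflist file changed to /etc/cni/10-aws.conflist, but the nodes are still in the "NotReady" state.
My EKS version is 1.14 and the platform version is eks.2.
Ipamd log: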
2019-11-27T09:09:13.446Z [INFO] Starting L-IPAMD v1.5.5 ...
2019-11-27T09:09:43.447Z [INFO] Testing communication with server
2019-11-27T09:10:13.448Z [INFO] Failed to communicate with K8S Server. Please check instance security groups or http proxy setting
2019-11-27T09:10:13.448Z [ERROR] Failed to create client: error communicating with apiserver: Get https://172.20.0.1:443/version?timeout=32s: dial tcp 172.20.0.1:443: i/o timeout
The error from the containers is:
Warning FailedCreatePodSandBox 17m kubelet, ip-10-1-1-144.eu-west-1.compute.internal Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "b02f175d5e68011332655e0d6e6aa3ae226bbd7bf447c7461c0140a7e026d831" network for pod "coredns-759d6fc95f-zx292": NetworkPlugin cni failed to set up pod "coredns-759d6fc95f-zx292_kube-system" network: failed to find plugin "aws-cni" in path [/opt/cni/bin], failed to clean up sandbox container "b02f175d5e68011332655e0d6e6aa3ae226bbd7bf447c7461c0140a7e026d831" network for pod "coredns-759d6fc95f-zx292": NetworkPlugin cni failed to teardown pod "coredns-759d6fc95f-zx292_kube-system" network: failed to find plugin "aws-cni" in path [/opt/cni/bin]]
Normal SandboxChanged 2m47s (x70 over 17m) kubelet, ip-10-1-1-144.eu-west-1.compute.internal Pod sandbox changed, it will be killed and re-created.
CNI image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.5.5
Output of the /opt/cni/bin/aws-cni-support.sh script:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61679: Connection refused
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61679: Connection refused
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61679: Connection refused
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61679: Connection refused
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61679: Connection refused
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 61678: Connection refused
tar: Removing leading `/' from member names
/var/log/aws-routed-eni/
/var/log/aws-routed-eni/ipamd.log.2019-11-27-09
/var/log/aws-routed-eni/ipamd.log.2019-11-27-10
/var/log/aws-routed-eni/eni.out
/var/log/aws-routed-eni/pod.out
/var/log/aws-routed-eni/networkutils-env.out
/var/log/aws-routed-eni/ipamd-env.out
/var/log/aws-routed-eni/eni-configs.out
/var/log/aws-routed-eni/metrics.out
/var/log/aws-routed-eni/ifconfig.out
/var/log/aws-routed-eni/iprule.out
/var/log/aws-routed-eni/iptables-save.out
/var/log/aws-routed-eni/iptables.out
/var/log/aws-routed-eni/iptables-nat.out
/var/log/aws-routed-eni/iptables-mangle.out
/var/log/aws-routed-eni/cni/
/var/log/aws-routed-eni/cni/10-aws.conflist
/var/log/aws-routed-eni/messages
/var/log/aws-routed-eni/route.out
/var/log/aws-routed-eni/sysctls.out
In addition, /var/log/aws-routed-eni/messages contains many errors like the following: network: failed to find plugin "aws-cni" in path [/opt/cni/bin]
There is no /opt/cni/bin/aws-cni file.
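To confirm which CNI artifacts are absent on a node, a small check like the one below can help. It is only a sketch: the helper name `missing_cni_files` is made up for illustration, and the paths are the ones mentioned in this question, which may differ on other CNI versions.

```python
import os

# Paths taken from the question; adjust for your CNI version.
CNI_PATHS = [
    "/opt/cni/bin/aws-cni",
    "/etc/cni/net.d/10-aws.conflist",
]


def missing_cni_files(paths):
    """Return the paths that do not exist as regular files (hypothetical helper)."""
    return [p for p in paths if not os.path.isfile(p)]
```

Running `missing_cni_files(CNI_PATHS)` on the node returns the list of files that are not there; an empty list means both the plugin binary and the config file are present.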
Does anyone have a clue what could be causing this?
Answer 0 (score: 0)
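I had the same issue, and the problem turned out to be kube-proxy.
The aws-cni plugin is actually downloaded by the aws-node pods, so if they cannot connect to the master that never happens, which is why the config file and the binary are missing.
What fixed it for me was correcting the kube-proxy configuration (it was broken because of the --resource-container flag, which is no longer supported). That may not be the problem in your case, but I would definitely check the kube-proxy pods and look for problems in their logs.
Those logs are not available through kubectl logs ...; they are stored on the node in /var/log/kube-proxy.log.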
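A quick way to look for trouble in that log is to filter it for error lines. The sketch below rests on two assumptions that may not hold for every kube-proxy build: that the log uses the klog-style prefix where error and fatal lines begin with `E` or `F` followed by a four-digit date, and that a rejected flag is reported with the words "unknown flag". The function names are hypothetical.

```python
import re

# Assumption: klog-style severity prefix, e.g. "E1127 09:09:14.000000 ...".
SEVERITY = re.compile(r"^[EF]\d{4}\s")
# Assumption: an unsupported flag is reported with this phrase.
UNKNOWN_FLAG = re.compile(r"unknown flag")


def problem_lines(lines):
    """Return lines that look like errors, fatals, or flag-parsing failures."""
    return [
        ln.rstrip("\n")
        for ln in lines
        if SEVERITY.search(ln) or UNKNOWN_FLAG.search(ln)
    ]


def scan_file(path="/var/log/kube-proxy.log"):
    """Read the log at path (default taken from the answer) and filter it."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return problem_lines(fh)
```

Calling `scan_file()` on the node prints nothing by itself; it returns the suspicious lines so they can be inspected or printed. An empty result does not prove kube-proxy is healthy, since the patterns above are only a heuristic.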