k8s 1.18.1: API not reachable since update to 1.18.1

Time: 2020-06-23 17:29:39

Tags: api networking kubernetes

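I have updated my cluster to v1.18.1. Some applications that access the API now report that the API is unreachable, and a ping returns a similar error. Here are two outputs: the first is from a Go application, the second from a ping against the API.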

I0623 15:58:57.317985      23 trace.go:201] Trace[163617342]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.18.1/tools/cache/reflector.go:125 (23-Jun-2020 15:58:00.317) (total time: 30000ms):
Trace[163617342]: [30.000517214s] [30.000517214s] END
E0623 15:58:57.318003      23 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.1/tools/cache/reflector.go:125: Failed to list *v1.Service: Get "https://10.96.0.1:443/api/v1/namespaces/default/services?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: i/o timeout

$ kubectl exec -it network-tools -- ping 10.96.0.1
ping: socket: Operation not permitted
command terminated with exit code 2

I can rule out that the API is unreachable in general, because I can access it via kubectl.
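The ping result is probably not evidence of the same problem. `ping: socket: Operation not permitted` usually means the container is not allowed to open an ICMP socket (for example because it lacks the NET_RAW capability), and a ClusterIP handled by kube-proxy in iptables mode generally does not answer ICMP anyway. A TCP connect to 10.96.0.1:443 from inside a pod is closer to what the Go client does. A minimal sketch, assuming bash and the coreutils `timeout` command exist in the pod image; `check_tcp` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# check_tcp HOST PORT [SECONDS]
# Prints "reachable" if a TCP connection can be opened within the time
# limit, otherwise "unreachable". Uses bash's /dev/tcp redirection, so it
# needs no ICMP socket and no extra capabilities.
check_tcp() {
  local host=$1 port=$2 limit=${3:-3}
  if timeout "$limit" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}
```

Run inside the pod as `check_tcp 10.96.0.1 443`. While the problem persists it would be expected to print `unreachable`.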

I use flannel as the network plugin. To be safe, I re-applied the official flannel YAML to make sure I had not missed an update, but that did not help.

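Right now I simply do not know where the error comes from. To give anyone who wants to help some more information about the cluster, here are a few details.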

Volker

Nodes

$ kgno -o wide
NAME         STATUS   ROLES    AGE    VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
orbisos001   Ready    master   4h5m   v1.18.1   192.168.179.100   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64        cri-o://1.18.1
orbisos002   Ready    <none>   4h3m   v1.18.1   192.168.179.111   <none>        CentOS Linux 7 (Core)   3.10.0-1127.10.1.el7.x86_64   cri-o://1.18.1

Services

$ kgsv --all-namespaces
NAMESPACE     NAME         TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)                      AGE
default       kubernetes   ClusterIP      10.96.0.1      <none>            443/TCP                      4h10m
default       proxy        LoadBalancer   10.107.54.36   192.168.179.101   80:31414/TCP,443:32154/TCP   3h15m
kube-system   kube-dns     ClusterIP      10.96.0.10     <none>            53/UDP,53/TCP,9153/TCP       4h10m

/etc/cni/net.d/100-crio-bridge.conf

{
    "cniVersion": "0.3.1",
    "name": "crio",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "hairpinMode": true,
    "ipam": {
        "type": "host-local",
        "routes": [
            { "dst": "0.0.0.0/0" },
            { "dst": "1100:200::1/24" }
        ],
        "ranges": [
            [{ "subnet": "10.85.0.0/16" }],
            [{ "subnet": "1100:200::/24" }]
        ]
    }
}

RPM versions

$ yum list installed | grep -e kube -e cri-
cri-o.x86_64                       2:1.18.1-1.1.el7                 @crio
cri-tools.x86_64                   1.13.0-1.rhaos4.1.gitc06001f.el7 @tools
kubeadm.x86_64                     1.18.1-0                         @kubernetes
kubectl.x86_64                     1.18.1-0                         @kubernetes
kubelet.x86_64                     1.18.1-0                         @kubernetes

KUBELET_EXTRA_ARGS

$ cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice

1 answer:

Answer 0 (score: 2)

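After a long search I found the solution. The kube-flannel-amd64 container in the kube-system namespace initially raised an error saying that it had no permission to access iptables. As a result, IP packets were not routed through the VXLAN, which caused the timeout errors.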
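An error of this kind shows up in the flannel pod logs. A small sketch for pulling out the iptables-related lines, assuming the flannel pods carry the label `app=flannel` (adjust the selector if your manifest differs); both function names are illustrative:

```shell
#!/usr/bin/env bash
# Keep only the iptables-related lines from log text on stdin.
filter_iptables() {
  grep -i 'iptables'
}

# Fetch recent log lines from all flannel pods and filter them.
# Assumption: the pods are labelled app=flannel in kube-system.
flannel_iptables_log() {
  kubectl -n kube-system logs -l app=flannel --tail=200 2>&1 | filter_iptables
}
```

Calling `flannel_iptables_log` on a machine with kubectl access prints any matching lines. No output means the recent log contains nothing about iptables.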

To give the container access to the host's iptables, I changed privileged: false to privileged: true in the official kube-flannel.yml:

securityContext:
  privileged: true
  capabilities:
    add: ["NET_ADMIN"]

After redeploying the YAML file, the rules were created successfully:

I0625 08:53:13.166567       1 vxlan_network.go:60] watching for new subnet leases
I0625 08:53:13.168489       1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0625 08:53:13.168501       1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0625 08:53:13.169149       1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0625 08:53:13.170435       1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
I0625 08:53:13.171086       1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
I0625 08:53:13.262424       1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0625 08:53:13.263101       1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0625 08:53:13.263109       1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0625 08:53:13.264018       1 iptables.go:167] Deleting iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0625 08:53:13.264195       1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0625 08:53:13.264883       1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0625 08:53:13.267062       1 iptables.go:155] Adding iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0625 08:53:13.267781       1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
I0625 08:53:13.363094       1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
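
The same rules can also be checked directly on a node. A sketch, assuming root access on the node and the pod network 10.244.0.0/16 that appears in the log; `show_flannel_rules` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# List the iptables rules that mention the pod network.
# flannel places its masquerade rules in nat/POSTROUTING and its
# forwarding rules in filter/FORWARD.
show_flannel_rules() {
  local cidr=${1:-10.244.0.0/16}
  iptables -t nat -S POSTROUTING | grep -F -- "$cidr"
  iptables -S FORWARD | grep -F -- "$cidr"
}
```

If `show_flannel_rules` prints nothing, flannel has most likely not been able to install its rules on that node.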

My applications can now connect to the Kubernetes API.