I created a k8s cluster on GCP with kubeadm (1 master + 2 worker nodes), and everything seems fine except pod-to-pod communication.
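So, first things first: there are no visible issues in the cluster. All pods are running. No errors, no CrashLoopBackOffs, no pending pods.
I forced the following scenario for testing: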
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default bb-9bd94cf6f-b5cj5 1/1 Running 1 19h 192.168.2.3 worker-node-1
default curler-7668c66bf5-6c6v8 1/1 Running 1 20h 192.168.2.2 worker-node-1
default curler-master-5b86858f9f-c6zhq 1/1 Running 0 18h 192.168.0.6 master-node
default nginx-5c7588df-x42vt 1/1 Running 0 19h 192.168.2.4 worker-node-1
default nginy-6d77947646-4r4rl 1/1 Running 0 20h 192.168.1.4 worker-node-2
kube-system calico-node-9v98k 2/2 Running 0 97m 10.240.0.7 master-node
kube-system calico-node-h2px8 2/2 Running 0 97m 10.240.0.9 worker-node-2
kube-system calico-node-qjn5t 2/2 Running 0 97m 10.240.0.8 worker-node-1
kube-system coredns-86c58d9df4-gckhl 1/1 Running 0 97m 192.168.1.9 worker-node-2
kube-system coredns-86c58d9df4-wvt2n 1/1 Running 0 97m 192.168.2.6 worker-node-1
kube-system etcd-master-node 1/1 Running 0 97m 10.240.0.7 master-node
kube-system kube-apiserver-master-node 1/1 Running 0 97m 10.240.0.7 master-node
kube-system kube-controller-manager-master-node 1/1 Running 0 97m 10.240.0.7 master-node
kube-system kube-proxy-2g85h 1/1 Running 0 97m 10.240.0.8 worker-node-1
kube-system kube-proxy-77pq4 1/1 Running 0 97m 10.240.0.9 worker-node-2
kube-system kube-proxy-bbd2d 1/1 Running 0 97m 10.240.0.7 master-node
kube-system kube-scheduler-master-node 1/1 Running 0 97m 10.240.0.7 master-node
These are the services:
$ kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21h
default nginx ClusterIP 10.109.136.120 <none> 80/TCP 20h
default nginy NodePort 10.101.111.222 <none> 80:30066/TCP 20h
kube-system calico-typha ClusterIP 10.111.238.0 <none> 5473/TCP 21h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 21h
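The nginx and nginy services point to the nginx-xxx and nginy-xxx pods respectively, and both run nginx. The curlers are pods with curl and ping installed; one runs on the master node and the other on worker-node-1. If I exec into the curler pod running on worker-node-1 (curler-7668c66bf5-6c6v8) and curl the nginx pod on the same node, it works fine.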
$ kubectl exec -it curler-7668c66bf5-6c6v8 sh
/ # curl 192.168.2.4 -I
HTTP/1.1 200 OK
Server: nginx/1.15.12
Date: Tue, 07 May 2019 10:59:06 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 16 Apr 2019 13:08:19 GMT
Connection: keep-alive
ETag: "5cb5d3c3-264"
Accept-Ranges: bytes
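The same request can be repeated against a pod on a different node to isolate cross-node traffic. This is only a minimal sketch: the helper name curl_pod and the KUBECTL override are illustrative and not part of the original setup, and it assumes the curl in the curler image supports --max-time.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: run `curl -I` from inside one pod against a target IP.
#   $1 = pod to exec into, $2 = pod IP to request
curl_pod() {
  # --max-time keeps a silently dropped connection from hanging indefinitely
  $KUBECTL exec "$1" -- curl -sS -I --max-time 5 "$2"
}
```

For example, `curl_pod curler-7668c66bf5-6c6v8 192.168.2.4` targets the nginx pod on the same node, while `curl_pod curler-7668c66bf5-6c6v8 192.168.1.4` targets the nginy pod on worker-node-2. If only the second call times out, the failure is limited to traffic between nodes.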
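If I curl the nginx service by name instead, it works only about half of the time. Two coredns pods are running, one on worker-node-1 and the other on worker-node-2. I believe the request works when it goes to the coredns pod on worker-node-1 and fails when it goes to the one on worker-node-2.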
/ # curl nginx -I
curl: (6) Could not resolve host: nginx
/ # curl nginx -I
HTTP/1.1 200 OK
Server: nginx/1.15.12
Date: Tue, 07 May 2019 11:06:13 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 16 Apr 2019 13:08:19 GMT
Connection: keep-alive
ETag: "5cb5d3c3-264"
Accept-Ranges: bytes
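One way to confirm that suspicion is to query each coredns pod directly by its pod IP, bypassing the kube-dns service. The sketch below is not from the original post: the helper names are hypothetical, and it assumes the curler image ships nslookup and that the cluster uses the default cluster.local domain.

```shell
#!/bin/sh
# KUBECTL can be overridden for a dry run or for testing.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: resolve a name against one specific DNS server IP
# from inside a pod.  $1 = pod to exec into, $2 = name, $3 = DNS server IP
probe_dns() {
  $KUBECTL exec "$1" -- nslookup "$2" "$3"
}

# Query every given DNS pod IP in turn and report which ones answer.
#   $1 = pod to exec into, $2 = name to resolve, remaining args = DNS pod IPs
probe_all_dns() {
  src_pod="$1"
  lookup_name="$2"
  shift 2
  for server_ip in "$@"; do
    if probe_dns "$src_pod" "$lookup_name" "$server_ip" >/dev/null 2>&1; then
      echo "$server_ip ok"
    else
      echo "$server_ip FAILED"
    fi
  done
}
```

With the pod IPs from the listing above, `probe_all_dns curler-7668c66bf5-6c6v8 nginx.default.svc.cluster.local 192.168.2.6 192.168.1.9` would be expected to report a failure only for 192.168.1.9 if cross-node traffic is the problem.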
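So my pod-to-pod communication is definitely not working, at least across nodes. I checked the logs of the calico daemonset pods and found nothing suspicious. I do have some suspicious lines in the kube-proxy pod logs, though: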
$ kubectl logs kube-proxy-77pq4 -n kube-system
W0507 09:16:51.305357 1 server_others.go:295] Flag proxy-mode="" unknown, assuming iptables proxy
I0507 09:16:51.315528 1 server_others.go:148] Using iptables Proxier.
I0507 09:16:51.315775 1 server_others.go:178] Tearing down inactive rules.
E0507 09:16:51.356243 1 proxier.go:563] Error removing iptables rules in ipvs proxier: error deleting chain "KUBE-MARK-MASQ": exit status 1: iptables: Too many links.
I0507 09:16:51.648112 1 server.go:464] Version: v1.13.1
I0507 09:16:51.658690 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0507 09:16:51.659034 1 config.go:102] Starting endpoints config controller
I0507 09:16:51.659052 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I0507 09:16:51.659076 1 config.go:202] Starting service config controller
I0507 09:16:51.659083 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0507 09:16:51.759278 1 controller_utils.go:1034] Caches are synced for endpoints config controller
I0507 09:16:51.759291 1 controller_utils.go:1034] Caches are synced for service config controller
Can anyone tell me whether the problem could be caused by kube-proxy misconfiguring iptables? Or point out something I'm missing?
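Answer 0 (score: 0)
The original poster resolved the issue and provided the following solution:
The issue was that I had to allow IP-in-IP (ipip) traffic in my firewall rules in GCP. Now it works.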
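For readers hitting the same problem, a rule of that kind might be created roughly as follows. The poster did not share the exact command, so this is only a sketch: the helper name, rule name, network name, and source range are all assumptions (10.240.0.0/24 is a guess that covers the node IPs 10.240.0.7 to 10.240.0.9 shown in the question).

```shell
#!/bin/sh
# GCLOUD can be overridden (for example GCLOUD=echo) to preview the command.
GCLOUD="${GCLOUD:-gcloud}"

# Hypothetical helper: create a firewall rule allowing IP-in-IP traffic
# (IP protocol 4) between the cluster nodes.
#   $1 = rule name (illustrative), $2 = VPC network of the nodes,
#   $3 = CIDR covering the node IPs (an assumption, check your subnet)
allow_ipip() {
  $GCLOUD compute firewall-rules create "$1" \
    --network "$2" \
    --allow ipip \
    --source-ranges "$3"
}
```

Running `GCLOUD=echo allow_ipip allow-calico-ipip default 10.240.0.0/24` prints the command without applying it, which makes it easy to review the network and range before creating the rule.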