EKS nodes behind an ELB go OutOfService

Date: 2019-04-27 12:56:49

Tags: amazon-web-services tcp kubernetes amazon-elb amazon-eks

I have an EKS cluster with an ELB and 3 worker nodes attached. The application runs in a container on port 30590, and the health check is configured on the same port 30590. Kube-proxy is listening on this port, yet the worker nodes show as OutOfService behind the ELB. What I have tried so far:
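Before digging into iptables, a quick first check is whether anything on the node is actually bound to the NodePort. A minimal sketch, assuming port 30590 from above and standard iproute2 `ss`:

```shell
# Check whether anything (normally kube-proxy, for a NodePort Service)
# is listening on the NodePort on this worker node.
# Port 30590 is taken from the question.
port=30590
if ss -tln 2>/dev/null | grep -q ":$port "; then
  state="listening"
else
  state="not listening"
fi
echo "port $port is $state on this node"
```

If nothing is listening, the ELB health check will be refused before any pod is ever reached.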

  1. Disabled the source/destination check on the worker nodes.
  2. Disabled rp_filter via 'echo 0 | sudo tee /proc/sys/net/ipv4/conf/{all,eth0,eth1,eth2}/rp_filter'.
  3. Output of 'sudo iptables -vL':
 pkts bytes target     prot opt in     out     source               destination         
13884  826K KUBE-EXTERNAL-SERVICES  all  --  any    any     anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
2545K 1268M KUBE-FIREWALL  all  --  any    any     anywhere             anywhere            

Chain FORWARD (policy ACCEPT 92 packets, 28670 bytes)
 pkts bytes target     prot opt in     out     source               destination         
1307K  409M KUBE-FORWARD  all  --  any    any     anywhere             anywhere             /* kubernetes forwarding rules */
1301K  409M DOCKER-USER  all  --  any    any     anywhere             anywhere            

Chain OUTPUT (policy ACCEPT 139 packets, 12822 bytes)
 pkts bytes target     prot opt in     out     source               destination         
 349K   21M KUBE-SERVICES  all  --  any    any     anywhere             anywhere             ctstate NEW /* kubernetes service portals */
2443K  222M KUBE-FIREWALL  all  --  any    any     anywhere             anywhere            

Chain DOCKER (0 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (0 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  any    any     anywhere             anywhere            

Chain DOCKER-ISOLATION-STAGE-2 (0 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  any    any     anywhere             anywhere            

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
1301K  409M RETURN     all  --  any    any     anywhere             anywhere            

Chain KUBE-EXTERNAL-SERVICES (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain KUBE-FIREWALL (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  any    any     anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

Chain KUBE-FORWARD (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    3   180 ACCEPT     all  --  any    any     anywhere             anywhere             /* kubernetes forwarding rules */ mark match 0x4000/0x4000

Chain KUBE-SERVICES (1 references)
 pkts bytes target     prot opt in     out     source               destination
  4. Output of 'sudo tcpdump -i eth0 port 30590':
12:41:44.217236 IP ip-192-168-186-107.ec2.internal.22580 > ip-x-x-x-.ec2.internal.30590: Flags [S], seq 3790958206, win 29200, options [mss 1460,sackOK,TS val 10236779 ecr 0,nop,wscale 8], length 0
12:41:44.217834 IP ip-x-x-x-.ec2.internal.30590 > ip-192-168-186-107.ec2.internal.22580: Flags [R.], seq 0, ack 3790958207, win 0, length 0 
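An immediate RST in reply to a SYN, as in the capture above, is what a client experiences as "connection refused": nothing is accepting connections on that port. The same thing the ELB health check sees can be reproduced from the node with a plain bash TCP probe (a sketch; 30590 is the NodePort from the question, and no extra tools beyond bash and coreutils `timeout` are assumed):

```shell
# probe <host> <port> -> prints "open" or "refused-or-filtered".
# Uses bash's built-in /dev/tcp redirection, so nc/curl are not needed.
probe() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "refused-or-filtered"
  fi
}

# Roughly what the ELB health check does against the node:
probe 127.0.0.1 30590
```

"open" means the handshake completed; "refused-or-filtered" matches the RST behaviour in the tcpdump output.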

It looks like the EKS nodes are sending a TCP RST back to the ELB, which is why they are failing the ELB health checks. Can anyone help me figure out the problem?

1 answer:

Answer 0 (score: 2)

Found the fix :) The problem was in the replicationcontroller.json file: I had exposed the wrong port there, while trying to connect on a different one.
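For anyone hitting the same symptom: the chain only works when the port the container actually listens on, the port the manifest exposes, and the health-check port all line up. A toy sketch of that consistency check (30590 comes from the question; the variable names are illustrative, not from the original manifest):

```shell
# The app listens on container_port; the ReplicationController in the
# answer exposed a different port, so the ELB health check hit a closed
# port and got a TCP RST back.
container_port=30590   # what the app actually listens on (from the question)
exposed_port=30590     # what the manifest exposes; this was wrong originally
if [ "$container_port" -eq "$exposed_port" ]; then
  echo "ports match: health check can succeed"
else
  echo "port mismatch: health check gets connection refused (RST)"
fi
```

When the two values diverge, the failure mode is exactly the RST seen in the tcpdump capture above.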