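Kubernetes: cannot ping Pods across nodes

Time: 2018-07-30 08:28:01

Tags: networking amazon-ec2 routing kubernetes

I am currently following this tutorial: https://github.com/kelseyhightower/kubernetes-the-hard-way (except that I am on AWS, and there is nothing I can do about that).
I am at step 10 (https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/11-pod-network-routes.md), and I seem to have a problem reaching a pod on one worker from another worker.

Here are logs from two of the workers that highlight the problem: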

worker-0

root@worker-0:/home/admin# ip addr show eth0 | grep 'inet '
inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
root@worker-0:/home/admin# traceroute 10.200.1.10 -n -i cnio0 -I -m 5
traceroute to 10.200.1.10 (10.200.1.10), 5 hops max, 60 byte packets
 1  10.200.1.10  0.135 ms  0.079 ms  0.073 ms
root@worker-0:/home/admin# ping 10.240.1.232
PING 10.240.1.232 (10.240.1.232) 56(84) bytes of data.
64 bytes from 10.240.1.232: icmp_seq=1 ttl=64 time=0.151 ms
^C
--- 10.240.1.232 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
root@worker-0:/home/admin# traceroute 10.200.3.5 -g 10.240.1.232 -n -i eth0 -I -m 5
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 72 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
root@worker-0:/home/admin#

worker-2

root@worker-2:/home/admin# ip addr show eth0 | grep 'inet '
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
root@worker-2:/home/admin# traceroute 10.200.3.5 -n -i cnio0 -I -m 5
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 60 byte packets
 1  10.200.3.5  0.140 ms  0.077 ms  0.072 ms
root@worker-2:/home/admin# ping 10.200.3.5
PING 10.200.3.5 (10.200.3.5) 56(84) bytes of data.
64 bytes from 10.200.3.5: icmp_seq=1 ttl=64 time=0.059 ms
64 bytes from 10.200.3.5: icmp_seq=2 ttl=64 time=0.047 ms
^C
--- 10.200.3.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.047/0.053/0.059/0.006 ms
root@worker-2:/home/admin#

The pods are deployed correctly (I tried spawning 11 busybox instances; here is the result):

admin@ip-10-240-1-250:~$ kubectl get pods
busybox-68654f944b-vjs2s    1/1       Running     69         2d
busybox0-7665ddff5d-2856g   1/1       Running     69         2d
busybox1-f9585ffdb-tg2lj    1/1       Running     68         2d
busybox2-78c5d7bdb6-fhfdc   1/1       Running     68         2d
busybox3-74fd4b4f98-pp4kz   1/1       Running     69         2d
busybox4-55d568f8c4-q9hk9   1/1       Running     68         2d
busybox5-69f77b4fdb-d7jf2   1/1       Running     68         2d
busybox6-b5b869f4-2vnkz     1/1       Running     69         2d
busybox7-7df7958c4b-4bxzx   0/1       Completed   68         2d
busybox8-6d78f4f5d6-cvfx7   1/1       Running     69         2d
busybox9-86d49fdf4-75ddn    1/1       Running     68         2d

Thanks for your insights.

Edit: adding information about the workers

worker-0

root@worker-0:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:2b:ed:df:b7:58 brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2b:edff:fedf:b758/64 scope link
       valid_lft forever preferred_lft forever
root@worker-0:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0

worker-2

root@worker-2:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:b0:2b:67:73:9e brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b0:2bff:fe67:739e/64 scope link
       valid_lft forever preferred_lft forever
root@worker-2:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.3.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0

2 answers:

Answer 0: (score: 1)

Your nodes are missing routes to the pod subnets of the other nodes.
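A quick way to confirm this against the routing tables posted in the question is sketched below. It assumes `route -n` style output on stdin; the helper name `has_route` is made up for illustration, and it only does an exact match on the Destination column (it does not evaluate netmasks or the default route).

```shell
# has_route NETWORK
# Reads `route -n` output on stdin and succeeds (exit 0) only if some row
# has NETWORK in its first (Destination) column.
# Hypothetical helper for illustration; exact string match only.
has_route() {
  awk -v net="$1" '$1 == net { found = 1 } END { exit !found }'
}

# Example use on a worker (prints a message when the route is absent):
#   route -n | has_route 10.200.3.0 || echo "no route to 10.200.3.0/24"
```

Run against the worker-0 table from the question, this should report 10.200.1.0 as present and 10.200.3.0 as absent, which matches the failing traceroute.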

To make this work, you need either to add static routes on the worker nodes, or to add routes for all the pod subnets on the default gateway, 10.240.1.1.

First case:

Run on worker-0 (10.240.1.230):

route add -net 10.200.3.0/24 gw 10.240.1.232

Run on worker-2 (10.240.1.232):

route add -net 10.200.1.0/24 gw 10.240.1.230

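In this case traffic flows directly from one worker node to the other, but if your cluster grows you have to update the routing tables on all workers accordingly. Also, unless the routes are added to the cloud router as well, other VPC hosts will not be able to reach these subnets.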
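To illustrate why this approach needs touching every worker as the cluster grows, here is a small sketch that prints the `route add` commands a given worker would need, one per peer. The function name `print_peer_routes` and the "node IP, pod CIDR" input format are assumptions for illustration; the two mappings shown are the ones visible in the question. It only prints commands and does not change any routing table.

```shell
# print_peer_routes SELF_IP
# Reads "NODE_IP POD_CIDR" pairs on stdin and prints one `route add` command
# for every node except SELF_IP. Hypothetical helper; prints only.
print_peer_routes() {
  self="$1"
  while read -r node_ip pod_cidr; do
    [ -n "$node_ip" ] || continue                # skip blank lines
    if [ "$node_ip" = "$self" ]; then continue; fi  # no route to our own subnet
    echo "route add -net $pod_cidr gw $node_ip"
  done
}

# Example: commands for worker-0, using the two workers from the question.
#   printf '%s\n' '10.240.1.230 10.200.1.0/24' '10.240.1.232 10.200.3.0/24' \
#     | print_peer_routes 10.240.1.230
```

With N workers each node ends up with N-1 such routes, so adding one worker means a change on every existing worker.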

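Second case:

On the default router (10.240.1.1):

route add -net 10.200.3.0/24 gw 10.240.1.232
route add -net 10.200.1.0/24 gw 10.240.1.230

In this case traffic is routed through the default router, and when a new node is added to the cluster you only need to update one routing table, the one on the default router.
This is the solution used in the Routes part of "Kubernetes The Hard Way".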

This article is helpful for creating the routes with the AWS CLI.
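On AWS the default router is not a host you can log in to, so the second case is normally done by adding entries to the VPC route table. A rough sketch follows, assuming one route table for the workers' subnet. The route table ID and instance IDs are placeholders, not values from the question, and the helper name `print_vpc_routes` is made up. It also prints a command to disable the source/destination check, which EC2 instances generally need before they will forward traffic for pod addresses that are not their own. The function only prints the `aws` commands so you can review them before running anything.

```shell
# print_vpc_routes ROUTE_TABLE_ID
# Reads "INSTANCE_ID POD_CIDR" pairs on stdin and prints, per worker:
#   1. a command disabling the EC2 source/destination check, and
#   2. a command adding a VPC route for the pod CIDR via that instance.
# Hypothetical helper; all IDs passed in are placeholders you must replace.
print_vpc_routes() {
  route_table_id="$1"
  while read -r instance_id pod_cidr; do
    [ -n "$instance_id" ] || continue   # skip blank lines
    echo "aws ec2 modify-instance-attribute --instance-id $instance_id --no-source-dest-check"
    echo "aws ec2 create-route --route-table-id $route_table_id --destination-cidr-block $pod_cidr --instance-id $instance_id"
  done
}

# Example with placeholder IDs (replace with the real ones from your VPC):
#   printf '%s\n' 'i-WORKER0 10.200.1.0/24' 'i-WORKER2 10.200.3.0/24' \
#     | print_vpc_routes rtb-PLACEHOLDER
```

Once the output looks right, it can be piped to `sh` or pasted into a terminal that has AWS credentials configured.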

Answer 1: (score: 0)

Thanks @VAS, that was helpful.

On the Kubernetes admin machine:

# edit /etc/hosts

192.168.2.150 master master.localdomain
192.168.2.151 node1 node1.localdomain
192.168.2.152 node2 node2.localdomain
...

# then add routes
$ route add -net 10.244.1.0/24 gw node1
$ route add -net 10.244.2.0/24 gw node2
...

That is because

"... flannel gives each host an IP subnet (/24 by default) ..."

Flannel: A Network Fabric for Containers