Network speed in Docker containers slows down after installing OpenShift

Time: 2016-07-04 12:29:36

Tags: networking docker openshift bandwidth

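After installing a CentOS 7 VM in Google Cloud (n1-standard-2) with CentOS' own Docker package (docker-1.10.3-44), network performance is as expected. Downloading a 100MB file with wget from inside a Docker container takes 18 seconds.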

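However, if the same node is part of an OpenShift Origin (1.1) installation (using the default network settings where applicable), network performance in Docker containers drops drastically once the OpenShift Origin Ansible script has run: downloading the same 100MB file now takes more than an hour! This affects both master and node members of the OpenShift cluster.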

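This only affects the bandwidth of the containers; network performance of the CentOS host is unaffected. It also only affects throughput: ping times from the containers are unaffected as well.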

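I have reproduced this behavior multiple times. However, I have only been able to set up OpenShift clusters on Google Cloud (I tried an AWS installation, but it failed for unrelated reasons).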

I am new to OpenShift and have spent quite some time looking for online help on this issue, but no luck so far.

Update:

A tcpdump of the start of an HTTP connection on the container's interface (eth0) yields the following:

17:03:19.088888 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [S],
seq 1295390892, win 27400, options [mss 1370,sackOK,TS val 23947380
ecr 0,nop,wscale 7], length 0

17:03:19.106356 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [S.],
seq 403807014, ack 1295390893, win 26847, options [mss 1460,sackOK,
TS val 192703377 ecr 23947380,nop,wscale 7], length 0

17:03:19.106448 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 1, win 215, options [nop,nop,TS val 23947398 ecr 192703377],
length 0

17:03:19.106665 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [P.],
seq 1:147, ack 1, win 215, options [nop,nop,TS val 23947398
ecr 192703377], length 146

17:03:19.122733 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
ack 147, win 219, options [nop,nop,TS val 192703381 ecr 23947398],
length 0
(*)
17:03:19.123098 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 1:5433, ack 147, win 219, options [nop,nop,TS val 192703381
ecr 23947398], length 5432

17:03:19.123173 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 5433:10865, ack 147, win 219, options [nop,nop,TS val 192703381
ecr 23947398], length 5432

17:03:19.123250 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 10865:13581, ack 147, win 219, options [nop,nop,TS val 192703381
ecr 23947398], length 2716

17:03:19.152621 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 13581:14939, ack 147, win 219, options [nop,nop,TS val 192703389
ecr 23947398], length 1358

17:03:19.152702 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 1, win 236, options [nop,nop,TS val 23947444
ecr 192703381,nop,nop,sack 1 {13581:14939}], length 0

17:03:19.169088 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 1:2717, ack 147, win 219, options [nop,nop,TS val 192703393
ecr 23947444], length 2716

17:03:19.384652 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 1:1359, ack 147, win 219, options [nop,nop,TS val 192703447
ecr 23947444], length 1358

17:03:19.384749 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 1359, win 257, options [nop,nop,TS val 23947676
ecr 192703447,nop,nop,sack 1 {13581:14939}], length 0

17:03:19.401047 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 1359:4075, ack 147, win 219, options [nop,nop,TS val 192703451
ecr 23947676], length 2716

17:03:19.616644 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 1359:2717, ack 147, win 219, options [nop,nop,TS val 192703505
ecr 23947676], length 1358

17:03:19.616746 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 2717, win 278, options [nop,nop,TS val 23947908
ecr 192703505,nop,nop,sack 1 {13581:14939}], length 0

17:03:19.632969 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [P.],
seq 14939:16297, ack 147, win 219, options [nop,nop,TS val 192703509
ecr 23947908], length 1358

17:03:19.633023 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 16297:17655, ack 147, win 219, options [nop,nop,TS val 192703509
ecr 23947908], length 1358

17:03:19.633083 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 2717, win 299, options [nop,nop,TS val 23947925
ecr 192703505,nop,nop,sack 1 {13581:16297}], length 0

17:03:19.633096 IP 10.168.16.27.35050 > 54.76.101.68.http: Flags [.],
ack 2717, win 321, options [nop,nop,TS val 23947925
ecr 192703505,nop,nop,sack 1 {13581:17655}], length 0

17:03:19.649470 IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.],
seq 2717:5433, ack 147, win 219, options [nop,nop,TS val 192703513
ecr 23947925], length 2716

[...]

If I run the same tcpdump on the CentOS VM's network interface (eth0), I get:

09:57:52.353514 IP 10.168.16.27 > 54.76.101.68: ICMP 10.168.16.27
unreachable - need to frag (mtu 1360), length 556

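To me, it looks as if the first 5 packets match exactly. However, the first three packets after (*) in the host tcpdump ('seq 1:5433', 'seq 5433:10865', 'seq 10865:13581') do not show up in the container tcpdump. The next one ('seq 13581:14939') shows up again.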
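The comparison described above can also be done mechanically instead of by eye. The following is only a minimal sketch: the helper names `seq_ranges` and `missing_ranges` are made up for illustration, and the two sample dumps are shortened, hypothetical inputs modelled on the description, not the real captures. It extracts the `seq START:END` fields from two tcpdump text outputs and lists the ranges that appear in one capture but not in the other.

```python
import re

# Matches the "seq START:END" field that tcpdump prints for data segments.
# SYN packets print a single number ("seq 1295390892,") and are ignored.
SEQ_RE = re.compile(r"seq (\d+):(\d+)")


def seq_ranges(dump_text):
    """Return the set of (start, end) sequence ranges found in tcpdump text."""
    return {(int(start), int(end)) for start, end in SEQ_RE.findall(dump_text)}


def missing_ranges(reference_dump, other_dump):
    """Ranges present in reference_dump but absent from other_dump, sorted."""
    return sorted(seq_ranges(reference_dump) - seq_ranges(other_dump))


# Hypothetical, shortened sample inputs (not the real captures).
host_dump = """
IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.], seq 1:5433, ack 147
IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.], seq 5433:10865, ack 147
IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.], seq 10865:13581, ack 147
IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.], seq 13581:14939, ack 147
"""
container_dump = """
IP 54.76.101.68.http > 10.168.16.27.35050: Flags [.], seq 13581:14939, ack 147
"""

print(missing_ranges(host_dump, container_dump))
```

With real captures, the text of each dump would be read from a file written by `tcpdump`; only segments sent in the same direction are meaningful to compare.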

Update 2:

During the slow HTTP transfer, I also see repeated ICMP packets going from the host's eth0 to the source IP:

11:32:04.818451 IP 146.148.30.13 > 172.31.1.77: ICMP 146.148.30.13
unreachable - need to frag (mtu 1360), length 556

The ICMP packets seem to reach their destination, in this case a web server at AWS. A dump on that server shows:

{{1}}

So this does smell like an MTU problem, as Clayton mentioned below. To clarify, I see the delays with all sources I have tried, not just with this particular web server.
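The MTU suspicion can be checked with some quick arithmetic on the numbers in the dumps. This is a rough sketch under stated assumptions: IPv4 without IP options, and TCP timestamps as the only TCP option on data segments (which the `nop,nop,TS` fields suggest). The helper names are made up for illustration.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20    # bytes, before TCP options
TS_OPTION = 12     # 10-byte timestamp option padded with two NOPs


def ip_packet_size(payload, tcp_options=TS_OPTION):
    """Total IPv4 packet size for a TCP segment carrying `payload` bytes."""
    return IPV4_HEADER + TCP_HEADER + tcp_options + payload


def max_payload(mtu, tcp_options=TS_OPTION):
    """Largest TCP payload that fits into one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER - tcp_options


PATH_MTU = 1360    # from the ICMP "need to frag (mtu 1360)" messages
ADVERTISED_MSS = 1370   # from the container's SYN
SEGMENT = 1358     # smallest data segment length seen in the capture

size = ip_packet_size(SEGMENT)
print(size, size > PATH_MTU)           # 1410 True: too big for MTU 1360
print(ADVERTISED_MSS - TS_OPTION)      # 1358: matches the segment length
print(max_payload(PATH_MTU))           # 1308: what would fit instead
print(5432 // SEGMENT, 2716 // SEGMENT)  # 4 2: larger "packets" are multiples
```

Under these assumptions, each 1358-byte segment travels in a 1410-byte IP packet, 50 bytes more than the 1360 reported in the ICMP messages. The 5432-byte and 2716-byte entries are exact multiples of 1358, which would be consistent with segments being coalesced by receive offloading before tcpdump sees them, though the dumps alone do not prove that.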

Andre

0 answers:

No answers