Sender delay when packets exceed the MSS

Time: 2015-05-27 11:03:22

Tags: java linux tcp quickfixj

I am trying to solve an issue of large delays on outgoing messages that appear to be related to socket flushing behaviour. I have been taking packet captures of outgoing FIX messages from a quickfixj initiator to an acceptor.

To summarise the environment: the Java initiator makes a socket connection to a ServerSocket on another server. Both servers run Red Hat Enterprise Linux 5.10. The MSS reported by netstat on the interfaces is 0. The MTU of the NICs is 1500 (I believe the loopback interface is effectively unlimited). On the application side, messages are encoded by quickfixj into a byte array and written to the socket. The socket is configured with TCP_NODELAY enabled.
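For illustration, here is a minimal sketch of the initiator-side socket setup described above. The host name, port, and message bytes are placeholders, not the actual application code; quickfixj performs the real encoding and writing internally.

    import java.io.OutputStream;
    import java.net.Socket;

    // Minimal sketch of the initiator-side socket setup. Host, port and
    // payload are illustrative; quickfixj manages the real session.
    public class InitiatorSketch {
        public static void main(String[] args) throws Exception {
            try (Socket socket = new Socket("acceptor.example.com", 6082)) {
                // Disable Nagle's algorithm, as in the environment above
                socket.setTcpNoDelay(true);

                // A FIX message is encoded to a byte array and written in one call
                byte[] encoded = "8=FIX.4.4\u00019=...\u000110=000\u0001"
                        .getBytes("ISO-8859-1");
                OutputStream out = socket.getOutputStream();
                out.write(encoded);
                out.flush(); // flushes the stream; there is no TCP-level flush
            }
        }
    }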

I am fairly confident I can eliminate the application as the cause of the delay, because there is no sender delay when the acceptor (ServerSocket) runs on the same server as the initiator over the loopback interface. Here is a sample of some packet capture entries using the loopback interface:

"No.","Time","Source","Destination","Protocol","Length","SendingTime (52)","MsgSeqNum (34)","Destination Port","Info","RelativeTime","Delta","Push"
"0.001606","10:23:29.223638","127.0.0.1","127.0.0.1","FIX","1224","20150527-09:23:29.223","5360","6082","MarketDataSnapshotFullRefresh","0.001606","0.000029","Set"
"0.001800","10:23:29.223832","127.0.0.1","127.0.0.1","FIX","1224","20150527-09:23:29.223","5361","6082","MarketDataSnapshotFullRefresh","0.001800","0.000157","Set"
"0.001823","10:23:29.223855","127.0.0.1","127.0.0.1","FIX","1224","20150527-09:23:29.223","5362","6082","MarketDataSnapshotFullRefresh","0.001823","0.000023","Set"
"0.002105","10:23:29.224137","127.0.0.1","127.0.0.1","FIX","825","20150527-09:23:29.223","5363","6082","MarketDataSnapshotFullRefresh","0.002105","0.000282","Set"
"0.002256","10:23:29.224288","127.0.0.1","127.0.0.1","FIX","2851","20150527-09:23:29.224,20150527-09:23:29.224,20150527-09:23:29.224","5364,5365,5366","6082","MarketDataSnapshotFullRefresh","0.002256","0.000014","Set"
"0.002327","10:23:29.224359","127.0.0.1","127.0.0.1","FIX","825","20150527-09:23:29.224","5367","6082","MarketDataSnapshotFullRefresh","0.002327","0.000071","Set"
"0.287124","10:23:29.509156","127.0.0.1","127.0.0.1","FIX","1079","20150527-09:23:29.508","5368","6082","MarketDataSnapshotFullRefresh","0.287124","0.284785","Set"

The main things of interest are: 1/ the PUSH flag is set on every packet regardless of packet length (the largest here is 2851 bytes); 2/ the latency measure I am using is the "SendingTime", set on the message just before encoding, against the packet capture time "Time". The packet capture is taken on the same server as the initiator that sends the data. Over a capture of 10,000 packets there is no great difference between "SendingTime" and "Time" when using loopback. For this reason I believe I can eliminate the application as the cause of the sending delay.

When the acceptor is moved to another server on the LAN, the sender delay starts to get worse for packets larger than the MTU. Here is a snippet from that capture:

"No.","Time","Source","Destination","Protocol","Length","SendingTime (52)","MsgSeqNum (34)","Destination Port","Info","RelativeTime","Delta","Push"
"68.603270","10:35:18.820635","10.XX.33.115","10.XX.33.112","FIX","1223","20150527-09:35:18.820","842","6082","MarketDataSnapshotFullRefresh","68.603270","0.000183","Set"
"68.603510","10:35:18.820875","10.XX.33.115","10.XX.33.112","FIX","1223","20150527-09:35:18.820","843","6082","MarketDataSnapshotFullRefresh","68.603510","0.000240","Set"
"68.638293","10:35:18.855658","10.XX.33.115","10.XX.33.112","FIX","1514","20150527-09:35:18.821","844","6082","MarketDataSnapshotFullRefresh","68.638293","0.000340","Not set"
"68.638344","10:35:18.855709","10.XX.33.115","10.XX.33.112","FIX","1514","20150527-09:35:18.821","845","6082","MarketDataSnapshotFullRefresh","68.638344","0.000051","Not set"

The important point here is that when the packets are smaller than the MSS (which is derived from the MTU), the PUSH flag is set and there is no sender delay. That is expected, since disabling Nagle's algorithm causes PUSH to be set on these smaller packets. When the packet size is larger than the MSS (the packets of length 1514 in this case), the difference between the capture time and the SendingTime jumps to 35 ms.

It seems unlikely that this 35 ms delay is caused by the application encoding the messages, because messages of large packet sizes were sent in under 1 ms over the loopback interface. The capture is also taken on the sending side, so MTU fragmentation does not appear to be the cause either. The most likely explanation, it seems to me, is that because the PUSH flag is not set (the packet being larger than the MSS), the OS-level socket and/or TCP stack is deciding to flush it only 35 ms later. The test acceptor on the other server is not a slow consumer and sits on the same LAN, so ACKs are timely.

Can anyone offer any pointers on what could cause this socket send delay for packets larger than the MSS? Against a real counterparty in the US this sender delay reaches as high as 300 ms. I had thought that if the packet size was greater than the MSS it would be sent immediately regardless of previously received ACKs (as long as the socket buffer size is not exceeded). Netstat generally shows socket queues of 0 and healthy window sizes, and the problem seems to occur on all packets larger than the MSS, even right from start-up. It looks as though the socket is deciding not to flush immediately for some reason, but I am unsure what factors could cause that.

Edit: As EJP pointed out, there is no flush in Linux as such. As I understand it, a socket send places the data into the Linux kernel's network buffers, and for these non-pushed packets the kernel is waiting for the acknowledgement of a previous packet before sending them on. This is not what I would expect; in TCP I would expect packets to keep being delivered until the socket buffer fills up.

1 Answer:

Answer 0 (score: 1)

This is not a comprehensive answer, since TCP behaviour varies with many factors, but in this case it was the reason behind the problem we faced.

The congestion window in a TCP congestion-control implementation allows an increasing number of packets to be sent without acknowledgement, as long as no sign of congestion, i.e. retransmission, is detected. Generally speaking, when retransmissions occur, the congestion algorithm resets the congestion window, limiting how many packets can be sent before an acknowledgement is received. This manifests itself in the sender delays we witnessed, as packets are held in the kernel buffer awaiting acknowledgement of previous packets. No TCP_NODELAY, TCP_CORK or similar directives override this aspect of congestion-control behaviour.
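As an aside, one way to watch this happening is to inspect per-connection TCP state with the iproute ss utility, assuming it is available on your system (field names and availability vary by kernel and distribution version; the port number below is illustrative):

    # Show TCP internals, including the congestion window ("cwnd") and
    # retransmit counters, for connections to the acceptor port.
    ss -ti dst :6082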

In our case the situation was made worse by a long round-trip time to the other venue. However, since it was a dedicated line with very little packet loss per day, it was not retransmission that was triggering the congestion control. In the end the problem appears to have been solved by disabling the following flag in Linux. This flag also causes the congestion window to be reset, but in response to detected idleness rather than packet loss:

tcp_slow_start_after_idle - BOOLEAN
    If set, provide RFC2861 behavior and time out the congestion
    window after an idle period. An idle period is defined as
    the current RTO. If unset, the congestion window will not
    be timed out after an idle period.
    Default: 1

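Concretely, the flag can be disabled with standard sysctl usage along these lines (whether this is appropriate for your environment is a judgement call):

    # Disable the RFC 2861 idle timeout of the congestion window (needs root)
    sysctl -w net.ipv4.tcp_slow_start_after_idle=0

    # To persist across reboots, add to /etc/sysctl.conf:
    #   net.ipv4.tcp_slow_start_after_idle = 0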
(Note that if you run into these problems, it can also be worth investigating congestion control algorithms other than the one your kernel is currently set to use.)
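For example, the available and currently active algorithms can be inspected, and changed, with standard sysctls; which algorithms are listed depends on the kernel modules present, and "cubic" below is just an example value:

    sysctl net.ipv4.tcp_available_congestion_control   # list what the kernel offers
    sysctl net.ipv4.tcp_congestion_control             # show the algorithm in use
    sysctl -w net.ipv4.tcp_congestion_control=cubic    # switch (needs root)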