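We have a setup in which an Elastic LB spreads the load across two Apache servers, A1 and A2. These Apache servers render a few PHP pages and mostly redirect API requests to the Tomcat application servers T1 and T2, as shown in the diagram below: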
Incoming Request
        |
        |
        |
        \/
        LB
        /\
       /  \
      /    \
    A1      A2
    |\      /|
    | \    / |
    |  \  /  |
    |  /  \  |
    T1      T2
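We recently started noticing a delay between Apache and Tomcat. Here are sample log lines for the same request, taken from the Apache mod_slow log and the Tomcat access log: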
APACHE_MOD_SLOW: VNSdtwoAAJkAACnXb-cAAACJ [06/Feb/2015:16:25:51 +0530] elapsed: 50.58 cpu: 0.00(usr)/0.00(sys) pid: 10711 ip: 10.0.0.153 host: www.example.com:443 reqinfo: GET /data/v1/url?url=test-508324 HTTP/1.1
TOMCAT: [06/Feb/2015:16:26:42 +0530] "GET /data/v1/url?url=test-508324 HTTP/1.1" 200 65 10
Apache says the request came in at 06/Feb/2015:16:25:51 +0530 and took 50s to process, whereas Tomcat says it took only 10ms to process the request and that it received the request at 06/Feb/2015:16:26:42 +0530.
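To make the gap concrete, here is a small sketch of the arithmetic using only the values from the two log lines above. It assumes, as described, that the last field of the Tomcat line is the processing time in milliseconds; the variable names are mine.

```python
from datetime import datetime

# Timestamps copied from the two log lines above.
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"
apache_logged = datetime.strptime("06/Feb/2015:16:25:51 +0530", LOG_TIME_FORMAT)
tomcat_logged = datetime.strptime("06/Feb/2015:16:26:42 +0530", LOG_TIME_FORMAT)

apache_elapsed_s = 50.58         # "elapsed" field of the Apache line
tomcat_processing_s = 10 / 1000  # last field of the Tomcat line, read as milliseconds

# Difference between the two logged timestamps.
gap_s = (tomcat_logged - apache_logged).total_seconds()
# Apache time that Tomcat's own processing does not account for.
unaccounted_s = apache_elapsed_s - tomcat_processing_s

print(f"gap between log timestamps: {gap_s:.0f} s")
print(f"Apache time not explained by Tomcat processing: {unaccounted_s:.2f} s")
```

Both figures come out at roughly 50 seconds, so nearly all of the time Apache reports is spent outside Tomcat's request processing.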
This means Apache needed almost 50 seconds to connect and send the whole request to Tomcat. Apache is using mod_proxy_ajp to connect to Tomcat. Here is the configuration:
<Proxy balancer://prod>
BalancerMember ajp://127.0.0.1:8009 route=jvmRoute-8009 connectiontimeout=1 retry=300
BalancerMember ajp://10.0.0.153:8009 route=jvmRoute-8009 connectiontimeout=1 retry=300
ProxySet lbmethod=byrequests
</Proxy>
And here is Tomcat's connector configuration:
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" maxThreads="4096" minSpareThreads="25" maxSpareThreads="75"/>
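Given the connectiontimeout value, I assume Apache should not take more than 1 second to establish the connection. And since Apache and Tomcat are on the same machine, there should not be much of a lag once the connection is established.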
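One way to check that assumption outside Apache might be to time a plain TCP connect to the AJP port from the Apache host. The sketch below is only that: `time_connect` is a hypothetical helper of mine, it measures the TCP handshake and nothing of the AJP exchange, and the hosts and port are the ones from the BalancerMember lines.

```python
import socket
import time


def time_connect(host, port, timeout_s=1.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, unreachable host or timeout.
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        return None


if __name__ == "__main__":
    # Addresses and port taken from the BalancerMember lines above.
    for host in ("127.0.0.1", "10.0.0.153"):
        took = time_connect(host, 8009)
        result = "failed or timed out" if took is None else f"{took * 1000:.1f} ms"
        print(f"{host}:8009 -> {result}")
```

If these connects are consistently fast while requests still stall, the delay is more likely to sit somewhere after the connection is established.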
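In case it helps, we use https for requests, though I don't think that has anything to do with this. We ran ab tests to compare performance over https, over http, and when connecting directly to Tomcat. Here are the statistics: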
ab -n5000 -c5 https://example.com/test/100001
Requests per second: 13.67 [#/sec] (mean)
Time per request: 365.851 [ms] (mean)
Time per request: 73.170 [ms] (mean, across all concurrent requests)
Transfer rate: 79.96 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      236  267  95.5    247    3401
Processing:    83   98  58.6     89    1959
Waiting:       82   96  57.5     87    1959
Total:        319  365 134.0    338    3571
Percentage of the requests served within a certain time (ms)
50% 338
66% 347
75% 356
80% 364
90% 399
95% 477
98% 689
99% 869
100% 3571 (longest request)
ab -n5000 -c5 http://example.com/test/100001
Time per request: 186.015 [ms] (mean)
Time per request: 37.203 [ms] (mean, across all concurrent requests)
Transfer rate: 155.55 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       74   79  33.4     76    1278
Processing:    83  107  82.3     91    3964
Waiting:       82  105  60.9     89     940
Total:        157  186  90.1    168    4042
Percentage of the requests served within a certain time (ms)
50% 168
66% 174
75% 180
80% 184
90% 211
95% 259
98% 379
99% 507
100% 4042 (longest request)
ab -n5000 -c5 http://IP:8080/test/100001
Requests per second: 31.32 [#/sec] (mean)
Time per request: 159.624 [ms] (mean)
Time per request: 31.925 [ms] (mean, across all concurrent requests)
Transfer rate: 181.30 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       71   76  68.4     73    3079
Processing:    78   84  13.1     81     594
Waiting:       77   83   6.5     81     185
Total:        149  159  71.2    154    3313
Percentage of the requests served within a certain time (ms)
50% 154
66% 157
75% 160
80% 161
90% 166
95% 171
98% 177
99% 189
100% 3313 (longest request)
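The observations above lead me to believe that the poor performance depends on a particular sequence of events, since none of the requests in these tests performed anywhere near that badly.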
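For reference, a quick sketch of the comparison behind that conclusion, using only numbers copied from the ab output and the slow-log line. The labels and variable names are mine, and note that the direct run went to port 8080, so it does not exercise the AJP path at all.

```python
# Mean "Time per request" and worst-case "Total" from the three ab runs (ms).
runs = {
    "https via Apache": {"mean_ms": 365.851, "max_ms": 3571},
    "http via Apache": {"mean_ms": 186.015, "max_ms": 4042},
    "direct to Tomcat": {"mean_ms": 159.624, "max_ms": 3313},
}

slow_request_ms = 50.58 * 1000  # "elapsed" from the Apache slow-log line

# Average cost added by each layer, according to the ab means.
https_overhead_ms = runs["https via Apache"]["mean_ms"] - runs["http via Apache"]["mean_ms"]
apache_overhead_ms = runs["http via Apache"]["mean_ms"] - runs["direct to Tomcat"]["mean_ms"]

# Slowest single request seen in any of the three runs.
worst_test_ms = max(run["max_ms"] for run in runs.values())

print(f"https over http, mean: {https_overhead_ms:.3f} ms")
print(f"Apache over direct Tomcat, mean: {apache_overhead_ms:.3f} ms")
print(f"slowest request in any test: {worst_test_ms} ms")
print(f"slow-log request vs slowest test request: {slow_request_ms / worst_test_ms:.1f}x")
```

Even the worst request across all 15,000 test requests is an order of magnitude faster than the one in the slow log, which is why I suspect the stall is not a steady-state cost.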
Versions: Apache: 2.2.4, Tomcat: 8.0.16
Any ideas on where this lag comes from and how to reduce it?