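I am doing a POC of the aliveness test and ran into a strange scenario where it started timing out on all cluster nodes. I was planning to use this aliveness-test API as an application-level health check in HAProxy. This scenario worries me, because HAProxy started showing all cluster nodes as DOWN: the aliveness test timed out on every node, even though the RabbitMQ server port was connectable the whole time. According to the documentation, if the RabbitMQ management tool (listening on port 15672) is connectable, the server (listening on port 5672) can be assumed to be up and running.
When I restarted the rabbit nodes, they stayed in the same state.
When I killed the rabbitmq process with kill -9 on all hosts and then started the application normally, it recovered and started returning response code 200.
How can I avoid this scenario?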
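For reference, the kind of check described above can be sketched roughly as follows. This is only a minimal illustration, not the HAProxy configuration itself: the function names, the `guest`/`guest` credentials, the 5-second timeout, and the host name in the usage note are assumptions, while the `/api/aliveness-test/%2F` path matches the one in the logs below. It treats a node as healthy only when the management API answers 200 with `{"status": "ok"}`, and treats timeouts and errors as DOWN.

```python
import base64
import json
import urllib.error
import urllib.parse
import urllib.request


def aliveness_url(host, port=15672, vhost="/"):
    # The vhost must be percent-encoded in the path, so "/" becomes "%2F".
    encoded_vhost = urllib.parse.quote(vhost, safe="")
    return "http://%s:%d/api/aliveness-test/%s" % (host, port, encoded_vhost)


def is_alive(status_code, body):
    # Healthy only on HTTP 200 with a JSON body of {"status": "ok"}.
    if status_code != 200:
        return False
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        return False


def check_node(host, user="guest", password="guest", timeout=5.0):
    # Hypothetical helper: credentials and timeout are illustrative defaults.
    request = urllib.request.Request(aliveness_url(host))
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_alive(response.status, response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Timeouts, refused connections, and non-2xx replies all count as DOWN.
        return False
```

Calling `check_node("pgperf-rabbitmq2")` would return `False` in the situation described here, since the API call times out or returns "Not Found" even though port 5672 still accepts connections.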
Here are some error logs from one of the nodes:
------
=INFO REPORT==== 7-Jul-2017::16:05:18 ===
Mirrored queue 'aliveness-test' in vhost '/': Slave <rabbit@pgperf-rabbitmq2.1.26928.0> saw deaths of mirrors <rabbit@pgperf-rabbitmq1.1.18235.0>
=WARNING REPORT==== 7-Jul-2017::16:05:18 ===
Mnesia('rabbit@pgperf-rabbitmq2'): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=ERROR REPORT==== 7-Jul-2017::16:05:18 ===
** Generic server <0.27470.0> terminating
** Last message in was {'DOWN',#Ref<0.0.24.29729>,process,<7887.11927.0>,
noproc}
** When Server state == {state,
{20,<0.27470.0>},
{{17,<7888.18916.0>},#Ref<0.0.24.29728>},
{{10,<7887.11927.0>},#Ref<0.0.24.29729>},
{resource,<<"/">>,queue,<<"aliveness-test">>},
rabbit_mirror_queue_slave,
{21,
[{{10,<7887.11927.0>},
{view_member,
{10,<7887.11927.0>},
[],
{20,<0.27470.0>},
{14,<7888.18825.0>}}},
{{12,<0.26929.0>},
{view_member,
{12,<0.26929.0>},
[],
{14,<7888.18825.0>},
{17,<7888.18916.0>}}},
{{14,<7888.18825.0>},
{view_member,
{14,<7888.18825.0>},
[],
{10,<7887.11927.0>},
{12,<0.26929.0>}}},
{{17,<7888.18916.0>},
{view_member,
{17,<7888.18916.0>},
[],
{12,<0.26929.0>},
{20,<0.27470.0>}}},
{{20,<0.27470.0>},
{view_member,
{20,<0.27470.0>},
[],
{17,<7888.18916.0>},
{10,<7887.11927.0>}}}]},
-1,
[{{10,<7887.11927.0>},{member,{[],[]},1,1}},
{{12,<0.26929.0>},
{member,{[{1,process_death}],[]},1,-1}},
{{14,<7888.18825.0>},{member,{[],[]},0,0}},
{{17,<7888.18916.0>},{member,{[],[]},-1,-1}},
{{20,<0.27470.0>},{member,{[],[]},-1,-1}}],
[<0.27469.0>],
{[],[]},
[],0,undefined,
#Fun<rabbit_misc.execute_mnesia_transaction.1>,
false}
** Reason for termination ==
** {bad_return_value,
{error,
{function_clause,
[{gm,check_membership,
[{20,<0.27470.0>},{error,not_found}],
[{file,"src/gm.erl"},{line,1590}]},
{gm,'-record_dead_member_in_group/5-fun-1-',4,
[{file,"src/gm.erl"},{line,1132}]},
{mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,833}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,808}]},
{rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1,
[{file,"src/rabbit_misc.erl"},{line,537}]},
{worker_pool_worker,'-run/2-fun-0-',3,
[{file,"src/worker_pool_worker.erl"},{line,77}]}]}}}
=ERROR REPORT==== 7-Jul-2017::16:05:18 ===
** Generic server <0.27469.0> terminating
** Last message in was {'EXIT',<0.27470.0>,
{bad_return_value,
{error,
{function_clause,
[{gm,check_membership,
[{20,<0.27470.0>},{error,not_found}],
[{file,"src/gm.erl"},{line,1590}]},
{gm,'-record_dead_member_in_group/5-fun-1-',4,
[{file,"src/gm.erl"},{line,1132}]},
{mnesia_tm,apply_fun,3,
[{file,"mnesia_tm.erl"},{line,833}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,808}]},
{rabbit_misc,
'-execute_mnesia_transaction/1-fun-0-',1,
[{file,"src/rabbit_misc.erl"},{line,537}]},
{worker_pool_worker,'-run/2-fun-0-',3,
[{file,"src/worker_pool_worker.erl"},
{line,77}]}]}}}}
** When Server state == {state,
{amqqueue,
{resource,<<"/">>,queue,<<"aliveness-test">>},
false,false,none,[],<7888.18913.0>,
[<7887.11753.0>],
[<7887.11753.0>],
['rabbit@pgpdr-rabbitmq2'],
[{vhost,<<"/">>},
{name,<<"ha-all">>},
{pattern,<<>>},
{'apply-to',<<"all">>},
{definition,
[{<<"ha-mode">>,<<"all">>},
{<<"ha-sync-mode">>,<<"automatic">>}]},
{priority,0}],
[{<7888.18916.0>,<7888.18913.0>},
{<7887.11774.0>,<7887.11753.0>}],
[],live},
<0.27470.0>,rabbit_priority_queue,
{passthrough,rabbit_variable_queue,
{vqstate,
{0,{[],[]}},
{0,{[],[]}},
{delta,undefined,0,undefined},
{0,{[],[]}},
{0,{[],[]}},
0,
{0,nil},
{0,nil},
{0,nil},
{qistate,
"/paytm/rabbitmq/mnesia/rabbit@pgperf-rabbitmq2/queues/1EZ1LHRKWKS0CPF59OD5ZBSYL",
{{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
[]},
undefined,0,32768,
#Fun<rabbit_variable_queue.2.95522769>,
#Fun<rabbit_variable_queue.3.95522769>,
{0,nil},
{0,nil},
[],[]},
{undefined,
{client_msstate,msg_store_transient,
<<190,153,122,22,186,127,25,5,168,81,229,140,5,
142,73,73>>,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
{state,438349,
"/paytm/rabbitmq/mnesia/rabbit@pgperf-rabbitmq2/msg_store_transient"},
rabbit_msg_store_ets_index,
"/paytm/rabbitmq/mnesia/rabbit@pgperf-rabbitmq2/msg_store_transient",
<0.370.0>,442446,434242,446543,450640,
{2000,500}}},
false,0,4096,0,0,0,0,0,infinity,0,0,0,0,0,0,
{rates,0.0,0.0,0.0,0.0,-576459879667187727},
{0,nil},
{0,nil},
{0,nil},
{0,nil},
0,0,0,0,2048,default}},
undefined,undefined,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
{state,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
delegate},
undefined}
** Reason for termination ==
** {bad_return_value,
{error,
{function_clause,
[{gm,check_membership,
[{20,<0.27470.0>},{error,not_found}],
[{file,"src/gm.erl"},{line,1590}]},
{gm,'-record_dead_member_in_group/5-fun-1-',4,
[{file,"src/gm.erl"},{line,1132}]},
{mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,833}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,808}]},
{rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1,
[{file,"src/rabbit_misc.erl"},{line,537}]},
{worker_pool_worker,'-run/2-fun-0-',3,
[{file,"src/worker_pool_worker.erl"},{line,77}]}]}}}
=WARNING REPORT==== 7-Jul-2017::16:05:18 ===
Mnesia('rabbit@pgperf-rabbitmq2'): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=ERROR REPORT==== 7-Jul-2017::16:47:39 ===
Channel error on connection <0.27776.0> (<rabbit@pgperf-rabbitmq2.1.27776.0>, vhost: '/', user: 'guest'), channel 1:
operation basic.get caused a channel exception not_found: "failed to perform operation on queue 'aliveness-test' in vhost '/' due to timeout"
=ERROR REPORT==== 7-Jul-2017::16:47:39 ===
webmachine error: path="/api/aliveness-test/%2F"
"Not Found"
=ERROR REPORT==== 7-Jul-2017::16:47:39 ===
Channel error on connection <0.10847.1> (<rabbit@pgperf-rabbitmq2.1.10847.1>, vhost: '/', user: 'guest'), channel 1:
operation queue.declare caused a channel exception not_found: "failed to perform operation on queue 'aliveness-test' in vhost '/' due to timeout"
=ERROR REPORT==== 7-Jul-2017::16:47:39 ===
Channel error on connection <0.151.1> (<rabbit@pgperf-rabbitmq2.1.151.1>, vhost: '/', user: 'guest'), channel 1:
operation queue.declare caused a channel exception not_found: "failed to perform operation on queue 'aliveness-test' in vhost '/' due to timeout"