Why does the ActiveMQ cluster fail with "server null" when the ZooKeeper master node goes offline?

Time: 2015-12-18 15:20:39

Tags: activemq apache-zookeeper leveldb

I have run into a problem with ActiveMQ: when the master ZooKeeper node goes offline, the entire cluster fails.

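We have a 3-node ActiveMQ cluster set up in our development environment. Each node runs ActiveMQ 5.12.0 and ZooKeeper 3.4.6. (Note: we have done some testing with ZooKeeper 3.4.7, but that did not resolve the problem. Time constraints have so far prevented us from testing ActiveMQ 5.13.)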

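We have found that when we stop the master ZooKeeper process (via the "End Process Tree" command in Task Manager), the remaining two ZooKeeper nodes continue to run normally. Sometimes the ActiveMQ cluster handles this, but sometimes it does not.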

When the cluster fails, we typically see this in the ActiveMQ log:

2015-12-18 09:08:45,157 | WARN  | Too many cluster members are connected.  Expected at most 3 members but there are 4 connected. | org.apache.activemq.leveldb.replicated.MasterElector | WrapperSimpleAppMain-EventThread
...
...
2015-12-18 09:27:09,722 | WARN  | Session 0x351b43b4a560016 for server null, unexpected error, closing socket connection and attempting reconnect | org.apache.zookeeper.ClientCnxn | WrapperSimpleAppMain-SendThread(192.168.0.10:2181)
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.7.0_79]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)[:1.7.0_79]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]

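What immediately concerns us is that (A) ActiveMQ seems to think there are 4 members in the cluster when it is configured for 3, and (B) the server appears to be null when the exception is raised. We then raised ActiveMQ's logging level to DEBUG in order to display the member list: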

2015-12-18 09:33:04,236 | DEBUG | ZooKeeper group changed: Map(localhost -> ListBuffer((0000000156,{"id":"localhost","container":null,"address":null,"position":-1,"weight":5,"elected":null}), (0000000157,{"id":"localhost","container":null,"address":null,"position":-1,"weight":1,"elected":null}), (0000000158,{"id":"localhost","container":null,"address":"tcp://192.168.0.11:61619","position":-1,"weight":10,"elected":null}), (0000000159,{"id":"localhost","container":null,"address":null,"position":-1,"weight":10,"elected":null}))) | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-14
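To make the member list in that DEBUG line easier to read, here is a minimal Python sketch (not part of our setup) that pulls the entries out of the message and compares the count with `replicas="3"`. The regular expression and the names `LOG_LINE`, `parse_members`, and `REPLICAS` are assumptions made for illustration; the message text is copied from the log above, minus the timestamp and logger fields.

```python
import json
import re

# Message part of the DEBUG line above (timestamp and logger fields removed).
LOG_LINE = (
    'ZooKeeper group changed: Map(localhost -> ListBuffer('
    '(0000000156,{"id":"localhost","container":null,"address":null,'
    '"position":-1,"weight":5,"elected":null}), '
    '(0000000157,{"id":"localhost","container":null,"address":null,'
    '"position":-1,"weight":1,"elected":null}), '
    '(0000000158,{"id":"localhost","container":null,'
    '"address":"tcp://192.168.0.11:61619","position":-1,"weight":10,"elected":null}), '
    '(0000000159,{"id":"localhost","container":null,"address":null,'
    '"position":-1,"weight":10,"elected":null})))'
)

# Assumed shape of one entry: "(<10-digit sequence>,{flat JSON object})".
MEMBER_RE = re.compile(r'\((\d{10}),(\{[^{}]*\})\)')

REPLICAS = 3  # replicas="3" in the replicatedLevelDB configuration


def parse_members(line):
    """Return a list of (sequence, member_dict) pairs found in the line."""
    return [(seq, json.loads(body)) for seq, body in MEMBER_RE.findall(line)]


members = parse_members(LOG_LINE)
print(len(members), "members registered; at most", REPLICAS, "expected")
for seq, member in members:
    print(seq, "weight:", member["weight"], "address:", member["address"])
```

On this line it finds four registrations with weights 5, 1, 10 and 10, only one of which advertises an address. Our configuration has a single broker with weight 10, so one of the two weight-10 entries may be a leftover registration from an earlier session, though that is only a guess.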

Can anyone suggest why this might be happening and/or a way to resolve it? Our configuration is as follows:

ZooKeeper:

tickTime=2000
dataDir=C:\\zookeeper-3.4.7\\data
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.10:2888:3888
server.2=192.168.0.11:2888:3888
server.3=192.168.0.12:2888:3888
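For reference, the arithmetic behind this file can be sketched in Python. The `parse_cfg` and `summarize` helpers are hypothetical names; the only facts assumed are that `initLimit` and `syncLimit` are counted in ticks of `tickTime` milliseconds and that a ZooKeeper ensemble needs a strict majority of its servers to stay available.

```python
# Settings copied from the file above (dataDir omitted, it does not affect the math).
ZOO_CFG = """
tickTime=2000
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.10:2888:3888
server.2=192.168.0.11:2888:3888
server.3=192.168.0.12:2888:3888
"""


def parse_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def summarize(cfg):
    """Derive ensemble size, majority quorum and timeouts in milliseconds."""
    tick_ms = int(cfg["tickTime"])
    ensemble_size = sum(1 for key in cfg if key.startswith("server."))
    quorum = ensemble_size // 2 + 1  # strict majority
    return {
        "ensemble_size": ensemble_size,
        "quorum": quorum,
        "tolerated_failures": ensemble_size - quorum,
        "init_limit_ms": int(cfg["initLimit"]) * tick_ms,
        "sync_limit_ms": int(cfg["syncLimit"]) * tick_ms,
    }


summary = summarize(parse_cfg(ZOO_CFG))
for name, value in summary.items():
    print(name, "=", value)
```

For the file above this gives an ensemble of 3 with a quorum of 2, so losing one ZooKeeper server should leave the ensemble operational, which matches what we observe on the ZooKeeper side.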

ActiveMQ (server.1):

<persistenceAdapter>    
    <replicatedLevelDB
    directory="activemq-data"
    replicas="3"
    bind="tcp://0.0.0.0:61619"
    zkAddress="192.168.0.11:2181,192.168.0.10:2181,192.168.0.12:2181"
    zkPath="/activemq/leveldb-stores"
    hostname="192.168.0.10"
    weight="5"/>
    <!-- server.2 has a weight of 10, server.3 has a weight of 1 -->
</persistenceAdapter>
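Since the stack trace shows "Connection refused" against one of the addresses in `zkAddress`, a quick way to see which ZooKeeper servers are reachable from a broker host is sketched below in Python. It splits the `zkAddress` value into host/port pairs and sends the `ruok` four-letter command, which a running ZooKeeper 3.4.x server normally answers with `imok`. The helper names `parse_zk_address` and `probe` and the 2-second timeout are assumptions for illustration, not anything ActiveMQ does itself.

```python
import socket

# Value of the zkAddress attribute from the configuration above.
ZK_ADDRESS = "192.168.0.11:2181,192.168.0.10:2181,192.168.0.12:2181"


def parse_zk_address(zk_address):
    """Split a comma-separated host:port list into (host, port) tuples."""
    servers = []
    for entry in zk_address.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def probe(host, port, timeout=2.0):
    """Send 'ruok' and return the reply, or a short description of the failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
            return reply.decode("ascii", "replace")
    except OSError as exc:  # covers refused connections and timeouts
        return "unreachable: %s" % exc


servers = parse_zk_address(ZK_ADDRESS)
print(servers)
```

Calling `probe(host, port)` for each pair in `servers` from a broker machine should show the stopped server as unreachable and the other two replying, which would at least confirm that the brokers can still reach a ZooKeeper majority.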

0 answers:

No answers