ZooKeeper randomly loses a node

Time: 2016-03-01 20:53:09

Tags: apache-zookeeper

Environment: ZooKeeper v3.4.8, JRE 1.8.0_73, Windows 7

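I set up three nodes. Everything comes up fine, but after a random amount of time (> 5 minutes) the quorum fails. I have tried raising tickTime from the default of 2000 to 10000, which is the number of milliseconds in a Windows tick. Any ideas why I am losing a node?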

The actual IPs have been replaced with xx.x.xx.

Node 1 errors:

2016-03-01 14:26:44,836 [myid:1] - WARN  [WorkerSender[myid=1]:QuorumCnxManager@400] - Cannot open channel to 2 at election address /xx.x.xx.52:3889
java.net.ConnectException: Connection refused: connect
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
    at java.lang.Thread.run(Thread.java:745)
2016-03-01 14:26:44,836 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumPeer$QuorumServer@149] - Resolved hostname: xx.x.xx.52 to address: /xx.x.xx.52
2016-03-01 14:26:45,103 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@326] - Getting a diff from the leader 0x0
2016-03-01 14:26:45,118 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting: 0x0 to C:\zookeeper-3.4.8\data\version-2\snapshot.0
2016-03-01 14:26:45,822 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@600] - Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-03-01 14:26:48,693 [myid:1] - INFO  [/xx.x.xx.82:3888:QuorumCnxManager$Listener@541] - Received connection request /xx.x.xx.52:60858
2016-03-01 14:26:48,693 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@600] - Notification: 1 (message format version), 2 (n.leader), 0x100000000 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-03-01 14:41:46,010 [myid:1] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@810] - Connection broken for id 3, my id = 1, error =
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:189)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.net.SocketInputStream.read(SocketInputStream.java:203)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
2016-03-01 14:41:46,011 [myid:1] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
2016-03-01 14:41:46,011 [myid:1] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@727] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
2016-03-01 14:41:46,013 [myid:1] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@736] - Send worker leaving thread

Node 2 errors:

2016-03-01 14:41:48,977 [myid:2] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@810] - Connection broken for id 3, my id = 2, error =
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
2016-03-01 14:41:48,977 [myid:2] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
2016-03-01 14:41:48,977 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@727] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
    at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
2016-03-01 14:41:48,977 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@736] - Send worker leaving thread

Node 3 errors:

2016-03-01 14:41:45,460 [myid:3] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@810] - Connection broken for id 1, my id = 3, error =
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
2016-03-01 14:41:45,460 [myid:3] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
2016-03-01 14:41:45,460 [myid:3] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@727] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
    at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
2016-03-01 14:41:45,460 [myid:3] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@736] - Send worker leaving thread
2016-03-01 14:41:48,393 [myid:3] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@810] - Connection broken for id 2, my id = 3, error =
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
2016-03-01 14:41:48,393 [myid:3] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
2016-03-01 14:41:48,393 [myid:3] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@727] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
    at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
2016-03-01 14:41:48,393 [myid:3] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@736] - Send worker leaving thread

1 answer:

Answer 0: (score: 0)

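I also had VMware installed on one of the machines. Using netstat -an, we found that even though we had specified the IP of the Windows Ethernet adapter in zoo.cfg, ZooKeeper had actually bound to the IP of the virtual Ethernet adapter created by VMware. Once we disabled the virtual Ethernet adapter, everything worked fine.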
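For anyone who wants to repeat the netstat check, here is a minimal sketch in Python that reads `netstat -an` text in the Windows column layout and reports which local addresses are listening on the ZooKeeper ports. The function name `listening_addresses`, the sample text, and the 192.0.2.x addresses are made up for illustration; ports 2181 and 3888 are the ones that appear in the logs above. It is only a helper for eyeballing the binding, not part of ZooKeeper.

```python
def listening_addresses(netstat_output, ports):
    """Map each port of interest to the local IPs that have a TCP listener on it.

    Assumes the Windows `netstat -an` layout:
    Proto, Local Address, Foreign Address, State.
    """
    found = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only TCP rows in the LISTENING state (skips headers and UDP rows).
        if len(fields) < 4 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # Split on the last colon so IPv6 forms like [::]:2181 also work.
        host, _, port_text = fields[1].rpartition(":")
        if not port_text.isdigit():
            continue
        port = int(port_text)
        if port in ports:
            found.setdefault(port, set()).add(host)
    return {port: sorted(hosts) for port, hosts in found.items()}


# Hypothetical sample output; the addresses are placeholders, not from the question.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:2181           0.0.0.0:0              LISTENING
  TCP    192.0.2.10:3888        0.0.0.0:0              LISTENING
  TCP    192.0.2.10:49152       192.0.2.11:3888        ESTABLISHED
"""

result = listening_addresses(SAMPLE, {2181, 3888})
for port, hosts in sorted(result.items()):
    print(port, hosts)
```

If the address printed for the election port is not the one written in zoo.cfg (for example, it belongs to a virtual adapter), that matches the situation described in this answer.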