ZooKeeper server shuts down after some time

Time: 2018-06-12 21:15:25

Tags: apache-kafka apache-zookeeper hortonworks-data-platform

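We have an HDP cluster, version 2.6.4, with 3 ZooKeeper servers, version 3.4.x.

The first ZooKeeper server does not work properly and goes down after some time.

From the Ambari GUI we can see that ZooKeeper has been disconnected.

From the ZooKeeper log we can see the following: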

java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,856 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,857 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,857 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,857 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,857 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2018-06-12 18:35:01,857 - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1082)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:391)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)

When we test ZooKeeper, we get:

echo stat | nc 14.42.169 2181

Latency min/avg/max: 0/10/2727
Received: 600879
Sent: 103803
Connections: 30
Outstanding: 546
Zxid: 0x3e000048c3
Mode: follower
Node count: 43296
  • Note that Sent is much lower than Received!
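
To compare these counters over time rather than by eye, the `stat` output can be parsed into numbers. The following is a minimal sketch in Python, not part of the original post: the function names `parse_stat` and `has_backlog` are hypothetical, and the `max_outstanding` threshold is an arbitrary illustrative value, not a ZooKeeper default. The sample text is the output quoted above.

```python
def parse_stat(text):
    """Turn the key/value lines of `stat` output into a dict of typed values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Latency min/avg/max":
            low, avg, high = (int(v) for v in value.split("/"))
            stats["latency_min"] = low
            stats["latency_avg"] = avg
            stats["latency_max"] = high
        elif key in ("Received", "Sent", "Connections", "Outstanding", "Node count"):
            stats[key.lower().replace(" ", "_")] = int(value)
        elif key == "Zxid":
            stats["zxid"] = int(value, 16)  # zxid is printed in hexadecimal
        elif key == "Mode":
            stats["mode"] = value
    return stats


def has_backlog(stats, max_outstanding=100):
    """Flag a server whose queue of outstanding requests looks large.

    max_outstanding is an assumed, illustrative threshold.
    """
    return stats.get("outstanding", 0) > max_outstanding


# The output quoted in the question.
SAMPLE = """\
Latency min/avg/max: 0/10/2727
Received: 600879
Sent: 103803
Connections: 30
Outstanding: 546
Zxid: 0x3e000048c3
Mode: follower
Node count: 43296
"""

stats = parse_stat(SAMPLE)
print(stats["received"] - stats["sent"], has_backlog(stats))
```

Running the same parser against each of the three servers would show whether only the first one accumulates outstanding requests.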

We can also see many CLOSE-WAIT connections:

#  ss -anop | grep 2181 | grep CLOSE | awk '{print $1" "$2}' | more
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
tcp CLOSE-WAIT
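
CLOSE-WAIT means the remote peer has closed its end of the connection while the local process has not yet closed its socket, so a growing count points at the server side. A rough way to track the count is to tally the states. This is a sketch only, assuming the input is the two-column `protocol state` text produced by the `awk` command above (without the `grep CLOSE` filter, so that other states are counted too); `count_states` is a hypothetical helper name.

```python
from collections import Counter


def count_states(text):
    """Count socket states from lines of the form '<protocol> <state>'."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1  # second column is the TCP state
    return counts


# Illustrative input in the same shape as the listing above.
SAMPLE_STATES = "tcp CLOSE-WAIT\ntcp CLOSE-WAIT\ntcp ESTAB\n"

for state, n in count_states(SAMPLE_STATES).most_common():
    print(state, n)
```

Sampling this periodically shows whether the number of CLOSE-WAIT sockets keeps growing until the server stops responding.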

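To try to resolve this, we did the following, without success:

  1. Increased the Java heap size to 8G (for ZooKeeper only)

  2. Increased zookeeper.session.timeout.ms on Kafka

None of this helped.

Please advise what could be causing this problem.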

1 answer:

Answer 0: (score: 0)

This looks like a bug that has since been fixed: https://issues.apache.org/jira/browse/ZOOKEEPER-2044

Try upgrading your ZooKeeper.
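
To check whether an upgrade is needed, the running version can be read from the first line of the full `stat` (or `srvr`) output, which normally begins with `Zookeeper version:`. The sketch below is an illustration under assumptions: the helper names are hypothetical, the sample version line is made up for the example, and the release containing the fix is deliberately left as a parameter, to be taken from the "Fix Version/s" field of the JIRA ticket.

```python
import re


def parse_version(stat_text):
    """Extract (major, minor, patch) from a 'Zookeeper version: x.y.z...' line."""
    match = re.search(r"Zookeeper version:\s*(\d+)\.(\d+)\.(\d+)", stat_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_older_than(version, fixed_in):
    """True if `version` predates `fixed_in`; both are (major, minor, patch) tuples."""
    return version < fixed_in


# Illustrative version line, not taken from the cluster in the question.
EXAMPLE_LINE = "Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT"

print(parse_version(EXAMPLE_LINE))
```

Pass the fix version from the ticket as `fixed_in`; if `is_older_than` returns `True` for the running server, the upgrade suggested above applies.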