ZooKeeper exists failed after 3 retries

Date: 2013-11-27 09:54:39

Tags: hadoop hbase apache-zookeeper

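I am running Hadoop-1.2.1 and HBase-0.94.11 in pseudo-distributed mode.

A power failure brought my Hadoop and HBase setup down. After I restarted the machine and brought the pseudo-distributed setup back up, the HBase shell stopped with the following error: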

13/11/27 13:53:27 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
13/11/27 13:53:27 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
    at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
    at org.apache.hadoop.hbase.zookeeper.ClusterId.getId(ClusterId.java:50)
    at org.apache.hadoop.hbase.zookeeper.ClusterId.hasId(ClusterId.java:44)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:720)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:789)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:129)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

These are the running processes:

hduser@user-ubuntu:~$ jps
16914 NameNode
19955 Jps
29460 Main
17728 TaskTracker
19776 HMaster
17490 JobTracker
17392 SecondaryNameNode

2 answers:

Answer 0: (score: 1)

Are you sure your ZooKeeper process is running? Your jps listing shows no entry for QuorumPeerMain. jps may not list every running Java process, so try ps axww | grep QuorumPeerMain
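A minimal sketch of that check, assuming a POSIX shell. It looks for either QuorumPeerMain (a standalone ZooKeeper) or HQuorumPeer (the process name when HBase starts ZooKeeper itself). The helper name has_zk_process is made up for illustration.

```shell
#!/bin/sh
# has_zk_process (hypothetical helper): reads a process listing on stdin and
# succeeds if it mentions a ZooKeeper server main class.
has_zk_process() {
    grep -E -q 'QuorumPeerMain|HQuorumPeer'
}

# ps axww prints full command lines, so the Java main class is visible.
# grep -v grep drops the grep processes themselves from the listing.
if ps axww | grep -v grep | has_zk_process; then
    echo "ZooKeeper process found"
else
    echo "no ZooKeeper process found"
fi
```

If this reports no process while HMaster is up, the ConnectionLoss error in the question is the expected symptom.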

If ZooKeeper refuses to start, check its log for a stack trace that hints at the cause.
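One way to skim the log, as a sketch only. The log location is an assumption: when HBase manages ZooKeeper, the log is usually under the HBase logs directory with "zookeeper" in the file name, but your install may differ. The helper name show_log_errors is hypothetical.

```shell
#!/bin/sh
# show_log_errors (hypothetical helper): print the last 20 lines of a log file
# that look like errors or stack-trace headers.
show_log_errors() {
    grep -E 'ERROR|FATAL|Exception' "$1" | tail -n 20
}

# Assumed location: $HBASE_HOME/logs/*zookeeper*.log (falls back to ./logs
# when HBASE_HOME is unset). Adjust the path for your installation.
for f in "${HBASE_HOME:-.}"/logs/*zookeeper*.log; do
    if [ -f "$f" ]; then
        echo "== $f"
        show_log_errors "$f"
    fi
done
```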

Answer 1: (score: 0)

The ZooKeeper quorum process is not running. If it were, the listing would show one more Java process:

hduser@user-ubuntu:~$ jps
16914 NameNode
19955 Jps
29460 Main
17728 TaskTracker
19776 HMaster
17490 JobTracker
17392 SecondaryNameNode

xxxxx HQuorumPeer

An HBase cluster needs ZooKeeper, because ZooKeeper coordinates the cluster.

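Possible solution: by default, HBase manages ZooKeeper itself, that is, it starts and stops the ZooKeeper quorum (the cluster of ZooKeeper nodes). To verify the setting, look at the file conf/hbase-env.sh (in your HBase directory); it must contain the line: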

export HBASE_MANAGES_ZK=true

This tells HBase whether it should manage its own ZooKeeper instance. If it is set to false, change it to true.
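A quick way to check for that line, sketched under the assumption that HBASE_HOME points at your HBase directory (the script falls back to the current directory otherwise). The helper name manages_zk is made up for illustration.

```shell
#!/bin/sh
# manages_zk (hypothetical helper): succeeds if the given hbase-env.sh has an
# uncommented line that exports HBASE_MANAGES_ZK=true.
manages_zk() {
    grep -E -q '^[[:space:]]*export[[:space:]]+HBASE_MANAGES_ZK=true' "$1"
}

env_file="${HBASE_HOME:-.}/conf/hbase-env.sh"
if [ -f "$env_file" ] && manages_zk "$env_file"; then
    echo "HBASE_MANAGES_ZK=true is set in $env_file"
else
    echo "no active HBASE_MANAGES_ZK=true line in $env_file"
fi
```

A commented-out line (one starting with #) is deliberately not counted as a match.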

Also verify the HBase configuration in conf/hbase-site.xml.

A minimal configuration that should work in pseudo-distributed mode:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/<yourusername>/zookeeper</value>
  </property>
</configuration>

Now stop HBase, if it is running:

$ ./bin/stop-hbase.sh

Make the necessary changes and start it again:

$ ./bin/start-hbase.sh
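
After the restart, you can ask ZooKeeper directly whether it is up. This is a sketch that assumes nc (netcat) is installed and that ZooKeeper listens on localhost:2181, the default client port. It sends the ruok command, to which a healthy server replies imok. The helper name is_imok is hypothetical.

```shell
#!/bin/sh
# is_imok (hypothetical helper): succeeds if stdin is exactly ZooKeeper's
# "imok" reply to the "ruok" command.
is_imok() {
    [ "$(cat)" = "imok" ]
}

# Assumed host and port: localhost:2181. -w 2 gives up after two seconds.
if echo ruok | nc -w 2 localhost 2181 2>/dev/null | is_imok; then
    echo "ZooKeeper answered imok"
else
    echo "ZooKeeper did not answer on localhost:2181"
fi
```

Once ZooKeeper answers, jps should also list an HQuorumPeer process next to HMaster.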
