Hadoop namenode fails to leave safe mode: exception while invoking setSafeMode

Time: 2013-03-27 17:00:51

Tags: hadoop

I am running a Hadoop cluster (version: cdh4.1.1) with two HA namenodes.

Step 1.

When I try to start my namenode, I get this exception:

2013-03-27 16:52:21,282 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Cannot start an HA namenode with name dirs that need recovery. Dir: Storage Directory /data/dfs/nn state: NOT_FORMATTED
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:288)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1128)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1192)
2013-03-27 16:52:21,285 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

Step 2.

Then I tried running sudo hdfs namenode -recover, and got:

13/03/27 16:53:37 INFO hdfs.StateChange: STATE* Safe mode is ON. 
Use "hdfs dfsadmin -safemode leave" to turn safe mode off.

Step 3.

Following that instruction, I ran sudo hdfs dfsadmin -safemode leave, and got:

WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
13/03/27 16:55:17 WARN retry.RetryInvocationHandler: Exception while invoking setSafeMode of class ClientNamenodeProtocolTranslatorPB after 1 fail over attempts. Trying to fail over after sleeping for 996ms.
13/03/27 16:55:18 WARN retry.RetryInvocationHandler: Exception while invoking setSafeMode of class ClientNamenodeProtocolTranslatorPB after 2 fail over attempts. Trying to fail over after sleeping for 2085ms.
......retrying......
Not retrying because failovers (15) exceeded maximum allowed (15)
java.net.ConnectException: Call From namenode-01.local/10.**.**.24 to namenode-02.local:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Any ideas are highly appreciated.

1 answer:

Answer 0 (score: 2)

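Unless there is some strange magic going on, you forgot to format the namenode (as the exception already states: the storage directory /data/dfs/nn is in state NOT_FORMATTED). If you have not done so yet, run hadoop namenode -format (the -format flag follows the namenode subcommand; hdfs namenode -format is the equivalent hdfs form). Note that this is destructive if your namenode has already been formatted.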
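
Because formatting is destructive, it may be worth checking the name directory before running it. The following is only a minimal sketch: nn_dir_state is a hypothetical helper, not part of Hadoop, and it assumes that a formatted name directory contains a current/VERSION file. The state names it prints are labels chosen to resemble the one in the exception, not output produced by Hadoop itself.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): guess the state of a
# NameNode name directory from what is on disk.
# Assumption: a formatted name directory contains current/VERSION.
nn_dir_state() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        # The directory itself is missing.
        echo "NON_EXISTENT"
    elif [ -f "$dir/current/VERSION" ]; then
        # A VERSION file is present, so the directory was formatted before.
        echo "FORMATTED"
    else
        # The directory exists but holds no VERSION file.
        echo "NOT_FORMATTED"
    fi
}

# Example with the directory named in the exception:
#   nn_dir_state /data/dfs/nn
# Only if this prints NOT_FORMATTED and no metadata needs to be kept:
#   hdfs namenode -format
# In an HA pair, the second namenode is normally initialized with
#   hdfs namenode -bootstrapStandby
# after the first one has been formatted and started, rather than
# being formatted a second time.
```

If the helper prints FORMATTED, formatting would wipe existing metadata, so the startup failure most likely has a different cause.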