Failed to start namenode: java.lang.IllegalStateException

Time: 2017-08-05 07:27:25

Tags: hadoop

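I am using an Apache Hadoop 2.7.1 high-availability cluster with two NameNodes, mn1 and mn2, and three JournalNodes.

While working on the cluster, I ran into the following error.

When I run start-dfs.sh, mn1 comes up as standby and mn2 as active.

After that, however, if one of the NameNodes goes down, it cannot be started again. These are the last log lines from one of the two NameNodes: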

2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 3 entries 72 lookups
2017-08-05 09:37:21,088 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 7052 msecs
2017-08-05 09:37:21,300 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to mn2:8020
2017-08-05 09:37:21,304 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-08-05 09:37:21,316 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2017-08-05 09:37:21,353 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2017-08-05 09:37:21,354 WARN org.apache.hadoop.hdfs.server.common.Util: Path /opt/hadoop/metadata_dir should be specified as a URI in configuration files. Please update hdfs configuration.
2017-08-05 09:37:21,361 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
        at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:119)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:5741)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1063)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:678)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:664)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2017-08-05 09:37:21,364 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-08-05 09:37:21,365 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at mn2/192.168.25.22
************************************************************/

3 answers:

Answer 0 (score: 0)

This may be because:

1. The NameNode port may be different on each node.

Answer 1 (score: 0)

This is a particularly vexing problem.

  1. Swallow IllegalStateExceptions thrown by removeShutdownHook in FileSystem. The javadoc states:

    public boolean removeShutdownHook(Thread hook) Throws: IllegalStateException - If the virtual machine is already in the process of shutting down

So if we are getting this exception, it MEANS we are already in the process of shutdown, so we CANNOT, try what we may, removeShutdownHook. If Runtime had a method Runtime.isShutdownInProgress(), we could have checked for it before the removeShutdownHook call. As it stands, there is no such method. In my opinion, this would be a good patch regardless of the needs for this JIRA.

  2. Do not send SIGTERMs from the NM to the MR-AM in the first place. Rather, we should expose a mechanism for the NM to politely tell the AM it's no longer needed and should shut down ASAP. Even after this, if an admin were to kill the MRAppMaster with a SIGTERM, the JobHistory would be lost, defeating the purpose of 3614

Answer 2 (score: 0)

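I found that my problem was in the JournalNode, not in the NameNode, even though the NameNode log showed the error quoted in the question.

jps listed the JournalNode, but that was misleading: the JournalNode service was actually down even though the process appeared in the jps output.

So, as a solution, I ran hadoop-daemon.sh stop journalnode and then hadoop-daemon.sh start journalnode.

After that, the NameNode started working again.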
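
Since jps only shows that a JVM process exists, one way to spot a JournalNode that is listed but not actually serving is to test whether its RPC port accepts TCP connections. The sketch below is a minimal illustration and not part of the original answer. It assumes the cluster uses the default JournalNode RPC port 8485 (the default of dfs.journalnode.rpc-address); the function names are made up for this example.

```python
import socket


def journalnode_reachable(host, port=8485, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    Port 8485 is the default JournalNode RPC port; this is an assumption,
    so adjust it if dfs.journalnode.rpc-address is set differently.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_journalnodes(hosts, port=8485, timeout=3.0):
    """Map each host name to whether its JournalNode port is reachable."""
    return {host: journalnode_reachable(host, port, timeout) for host in hosts}
```

For example, calling check_journalnodes(["jn1", "jn2", "jn3"]) with your own JournalNode host names (the three names here are hypothetical) returns a dictionary of host to True/False. A host that shows up in jps but maps to False is a candidate for the hadoop-daemon.sh stop/start sequence described above. An open port only shows that something is listening, so the JournalNode log is still worth checking.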