Both namenodes stay in standby after configuring HA

Time: 2017-04-03 12:26:17

Tags: hadoop

I have configured high availability in my cluster, which consists of three nodes:

hadoop-master (192.168.4.128) (namenode)

hadoop-slave-1 (192.168.4.111) (another namenode)

hadoop-slave-2 (192.168.4.106) (datanode)

The namenodes were not formatted (this was a conversion of a non-HA cluster to an HA-enabled one), as described here: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

However, both namenodes came up in standby state, so I tried to transition one of them to active by running the following command:

 hdfs haadmin -transitionToActive mycluster --forcemanual

which produced the following output:

17/04/03 08:07:35 WARN ha.HAAdmin: Proceeding with manual HA state management even though
automatic failover is enabled for NameNode at hadoop-master/192.168.4.128:8020
17/04/03 08:07:36 WARN ha.HAAdmin: Proceeding with manual HA state management even though
automatic failover is enabled for NameNode at hadoop-slave-1/192.168.4.111:8020
Illegal argument: Unable to determine service address for namenode 'mycluster'
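
For reference, hdfs haadmin -transitionToActive expects one of the namenode IDs declared under dfs.ha.namenodes.<nameservice>, not the nameservice name itself. A minimal sketch, assuming the IDs hadoop-master and hadoop-slave-1 from the hdfs-site.xml shown below:

    # Sketch only, not the asker's exact fix: pass a namenode ID, not the nameservice "mycluster".
    hdfs haadmin -transitionToActive hadoop-master --forcemanual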

My core-site.xml is:

<property>
    <name>dfs.tmp.dir</name>
    <value>/opt/hadoop/data15</value>
</property>
<property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:8020</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/journal/node/local/data</value>
</property>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp</value>
</property>
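
As a point of comparison, the HA guide linked above only needs fs.defaultFS to point at the logical nameservice; the deprecated fs.default.name entry above points at a single namenode host instead, and the guide places ha.zookeeper.quorum in core-site.xml. A rough sketch of the HA-relevant part of core-site.xml under that reading (not the asker's actual file):

    <!-- Sketch only: fs.defaultFS references the logical nameservice, not one host. -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop-master:2181,hadoop-slave-1:2181,hadoop-slave-2:2181</value>
    </property>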

My hdfs-site.xml is:

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop/data16</value>
    <final>true</final>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop/data17</value>
    <final>true</final>
</property>
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-slave-1:50090</value>
</property>
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
    <final>true</final>
</property>
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>hadoop-master,hadoop-slave-1</value>
    <final>true</final>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.hadoop-master</name>
    <value>hadoop-master:8020</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.hadoop-slave-1</name>
    <value>hadoop-slave-1:8020</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.hadoop-master</name>
    <value>hadoop-master:50070</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.hadoop-slave-1</name>
    <value>hadoop-slave-1:50070</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-master:8485;hadoop-slave-2:8485;hadoop-slave-1:8485/mycluster</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-master:2181,hadoop-slave-1:2181,hadoop-slave-2:2181</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>root/.ssh/id_rsa</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>3000</value>
</property>
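
Given the namenode IDs configured above, each node's HA state can also be queried directly; a hedged example, assuming both namenode daemons are running:

    # Sketch: query each configured namenode ID (not the nameservice) for its current HA state.
    hdfs haadmin -getServiceState hadoop-master
    hdfs haadmin -getServiceState hadoop-slave-1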

What should the service address value be, and what steps can I apply, in order, to bring one of the two namenodes into the active state?

Note that the ZooKeeper servers on all three nodes are stopped.

1 answer:

Answer 0 (score: 0)

I ran into the same problem, and it turned out that I had not formatted ZooKeeper or started the ZKFC.
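
For anyone hitting the same thing, a rough sketch of those two steps on a Hadoop 2.x cluster (assuming the ZooKeeper quorum hosts are reachable and HADOOP_HOME points at the Hadoop install):

    # ZooKeeper itself must be running on the quorum hosts first (e.g. zkServer.sh start on each).

    # Format the failover controller's znode in ZooKeeper (run once, on one namenode host).
    hdfs zkfc -formatZK

    # Start a ZKFC daemon on each namenode host so automatic failover can elect an active namenode.
    $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc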