我使用的是Hadoop 2.6.0-cdh5.6.0。我已经配置了HA。我正在显示活动(NN1)和备用名称节点(NN2)。现在,当我向活动名称节点(NN1)发出终止信号时,备用名称节点(NN2)在我再次启动NN1之前不会变为活动状态。再次启动NN1后,它处于待机状态,NN2进入活动状态。我没有配置“ha.zookeeper.session-timeout.ms”参数,所以我假设它默认为5秒。在检查主动和备用NN之前,我正等待时间完成。
我的core-site.xml
get_all_urls(letters):
urls = []
for letter in letters:
top_url = ...
players = re.findall(...)
for player in players:
urls.append(''.join(top_url, player))
return urls
我的hdfs-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster/</value>
</property>
<property>
<name>hadoop.proxyuser.mapred.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.mapred.hosts</name>
<value>*</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>172.17.5.107:2181,172.17.3.88:2181,172.17.5.128:2181</value>
</property>
</configuration>
我的zoo.cfg
<configuration>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoop</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/1/dfs/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/1/dfs/dn</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>172.17.5.107:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>172.17.3.88:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>172.17.5.107:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>172.17.3.88:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://172.17.5.107:8485;172.17.3.88:8485;172.17.5.128:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/1/dfs/jn</value>
</property>
</configuration>
答案 0 :(得分:1)
sshfence存在问题。授予hdfs用户权限或将其更改为root用户
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence(root)</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/1/dfs/jn</value>
</property>
</configuration>