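I am trying to build a Hadoop architecture with failover. My problem is that I cannot get the RegionServer configured correctly with HDFS HA. I get the following error in the RegionServer log: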
java.io.IOException: Port 9000 specified in URI hdfs://HAcluster:9000 but host 'HAcluster' is a logical (HA) namenode and does not use port information.
at org.apache.hadoop.hdfs.NameNodeProxies.getFailoverProxyProviderClass(NameNodeProxies.java:396)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:134)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
at org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:2508)
at org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:2492)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:62)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2543)
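The rule stated in the exception message can be shown outside Hadoop. The Python sketch below is not Hadoop code. It only mimics the stated rule: if the host part of an hdfs:// URI is a name listed in dfs.nameservices, the URI must not carry a port. The helper name `check_hdfs_uri` and the sample URIs are hypothetical and chosen for illustration.

```python
from urllib.parse import urlparse


def check_hdfs_uri(uri, nameservices):
    """Return an error string if `uri` gives a logical nameservice a port, else None.

    Hypothetical helper that mimics the rule in the exception message;
    it is not part of Hadoop or HBase.
    """
    parsed = urlparse(uri)
    # urlparse lowercases `hostname`, so compare case-insensitively.
    logical = {ns.lower() for ns in nameservices}
    if parsed.hostname in logical and parsed.port is not None:
        return (f"Port {parsed.port} specified in URI {uri} but host "
                f"'{parsed.hostname}' is a logical (HA) namenode")
    return None


if __name__ == "__main__":
    nameservices = ["HAcluster"]  # value of dfs.nameservices in hdfs-site.xml
    for uri in ("hdfs://HAcluster:9000",
                "hdfs://HAcluster/hbase",
                "hdfs://SUNRAY009IV06:9000/hbase"):
        print(uri, "->", check_hdfs_uri(uri, nameservices) or "ok")
```

Under this rule, only the first URI is rejected: a physical host with a port and a logical name without one are both accepted.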
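My components are listed below:
Regarding the architecture, I have 6 VMs:
One master server is active and the other is on standby (in case the first one crashes). The standby master is simply a replica of the active master.
Every component runs in HA (high availability) mode. To achieve this, I had to create logical clusters for HDFS and YARN.
The configuration files below may help to understand the setup:
hdfs-site.xml (defines HAcluster) - identical on the 3 servers, except for some properties outside the HA scope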
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>the value is the number of the copy of the file in the file system</description>
</property>
<!-- High Availability Hadoop -->
<property>
<name>dfs.nameservices</name>
<value>HAcluster</value> <!-- HAcluster consists of SUNRAY009IV06 = MASTER 1 and SUNRAY009IV07 = MASTER 2 -->
<final>true</final>
<description>The name of your cluster which consists of Master 1 and Master 2</description>
</property>
<property>
<name>dfs.ha.namenodes.HAcluster</name>
<value>SUNRAY009IV06,SUNRAY009IV07</value> <!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2 -->
<final>true</final>
<description>The namenodes in your cluster</description>
</property>
<property>
<name>dfs.namenode.rpc-address.HAcluster.SUNRAY009IV06</name>
<value>SUNRAY009IV06:9000</value> <!--SUNRAY009IV06 = MASTER 1 -->
<description>the RPC address of your Master 1</description>
</property>
<property>
<name>dfs.namenode.rpc-address.HAcluster.SUNRAY009IV07</name>
<value>SUNRAY009IV07:9000</value> <!--SUNRAY009IV07 = MASTER 2 -->
<description>the RPC address of your Master 2</description>
</property>
<property>
<name>dfs.namenode.http-address.HAcluster.SUNRAY009IV06</name>
<value>SUNRAY009IV06:50070</value> <!--SUNRAY009IV06 = MASTER 1 -->
<description>the HTTP address of your Master 1</description>
</property>
<property>
<name>dfs.namenode.http-address.HAcluster.SUNRAY009IV07</name>
<value>SUNRAY009IV07:50070</value> <!--SUNRAY009IV07 = MASTER 2 -->
<description>the HTTP address of your Master 2</description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://SUNRAY009IV06:8485;SUNRAY009IV07:8485;SUNRAY009IV08:8485/HAcluster</value>
<!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2, SUNRAY009IV08 = SLAVE 1 -->
<description>the location of the shared storage directory</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.HAcluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>the Java class that HDFS clients use to contact the Active NameNode</description>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>disable hdfs permissions</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>The backup is defined as automatic</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>SUNRAY009IV09:2181,SUNRAY009IV11:2181,SUNRAY009IV13:2181</value>
<description>The list of your Zookeeper servers in your Hadoop architecture</description>
<!--SUNRAY009IV09 = ZOOKEEPER 1, SUNRAY009IV11 = ZOOKEEPER 2, SUNRAY009IV13 = ZOOKEEPER 3 -->
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
<description> method which will be used to fence the Active NameNode during a failover.
sshfence = SSH to the Active NameNode and kill the process</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoopuser/.ssh/id_rsa</value>
<description>List of SSH private key files</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>3000</value>
<description>timeout</description>
</property>
</configuration>
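To see how the logical name HAcluster maps to physical NameNodes, the properties above can be read programmatically. The following Python sketch parses a Hadoop-style configuration and lists the RPC address of each NameNode behind each nameservice. The function names are hypothetical, and `SAMPLE` is a trimmed copy of the file above; it is not an official Hadoop tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the hdfs-site.xml shown above (HA-related properties only).
SAMPLE = """<configuration>
  <property><name>dfs.nameservices</name><value>HAcluster</value></property>
  <property><name>dfs.ha.namenodes.HAcluster</name><value>SUNRAY009IV06,SUNRAY009IV07</value></property>
  <property><name>dfs.namenode.rpc-address.HAcluster.SUNRAY009IV06</name><value>SUNRAY009IV06:9000</value></property>
  <property><name>dfs.namenode.rpc-address.HAcluster.SUNRAY009IV07</name><value>SUNRAY009IV07:9000</value></property>
</configuration>"""


def load_properties(xml_text):
    """Parse <property><name>/<value> pairs into a dict (hypothetical helper)."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def split_list(value):
    """Split a comma-separated value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def resolve_nameservices(props):
    """Map each nameservice to {namenode id: rpc address (None if missing)}."""
    result = {}
    for ns in split_list(props.get("dfs.nameservices", "")):
        ids = split_list(props.get(f"dfs.ha.namenodes.{ns}", ""))
        result[ns] = {
            nn: props.get(f"dfs.namenode.rpc-address.{ns}.{nn}") for nn in ids
        }
    return result


if __name__ == "__main__":
    for ns, namenodes in resolve_nameservices(load_properties(SAMPLE)).items():
        for nn, address in namenodes.items():
            print(f"{ns} -> {nn} at {address}")
```

The port lives only in the per-NameNode rpc-address entries, which is why a client URI that uses the logical name is expected to omit it.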
yarn-site.xml - identical on the 3 servers, except for some properties outside the HA scope
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>HAyarn</value>
<!--HAyarn consists of SUNRAY009IV06 = MASTER 1 and SUNRAY009IV07 = MASTER 2 -->
<description>The name of the Resource Manager</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>to enable YARN logs</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
<description>Where to store logs in HDFS</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
<description>mapreduce_shuffle service to implement</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>HAyarn:8031</value>
<!--HAyarn consists of SUNRAY009IV06 = MASTER 1 and SUNRAY009IV07 = MASTER 2 -->
<description>host is the hostname of the resource manager and the port is the port on which the NodeManagers contact the Resource Manager</description>
</property>
<!-- High Availability YARN -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>HAyarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>SUNRAY009IV06</value>
<!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2-->
<description>The hostname of MASTER 1</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>SUNRAY009IV07</value>
<!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2-->
<description>The hostname of MASTER 2</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>SUNRAY009IV06:8088</value>
<!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2-->
<description>The Web application address of MASTER 1</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>SUNRAY009IV07:8088</value>
<!--SUNRAY009IV06 = MASTER 1, SUNRAY009IV07 = MASTER 2-->
<description>The Web application address of MASTER 2</description>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>SUNRAY009IV09:2181,SUNRAY009IV11:2181,SUNRAY009IV13:2181</value>
<description>The list of your Zookeeper servers in your Hadoop architecture</description>
<!--SUNRAY009IV09 = ZOOKEEPER 1, SUNRAY009IV11 = ZOOKEEPER 2, SUNRAY009IV13 = ZOOKEEPER 3 -->
</property>
<property>
<name>yarn.client.failover-proxy-provider.HAyarn</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
<description>the class used for the YARN failover</description>
</property>
</configuration>
hbase-site.xml (identical on the 3 servers)
<property>
<name>hbase.rootdir</name>
<value>hdfs://HAcluster/hbase</value> <!--HAcluster consists of SUNRAY009IV06 = MASTER 1 and SUNRAY009IV07 = MASTER 2 -->
<description>The directory shared by RegionServers (slaves)</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description>Property from ZooKeeper's config zoo.cfg. The port at which the clients will connect.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>SUNRAY009IV09,SUNRAY009IV11,SUNRAY009IV13</value>
<description>The list of your Zookeeper servers in your Hadoop architecture</description>
<!--SUNRAY009IV09 = ZOOKEEPER 1, SUNRAY009IV11 = ZOOKEEPER 2, SUNRAY009IV13 = ZOOKEEPER 3 -->
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg. The directory where the snapshot is stored.</description>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase</value>
<description>The root znode that will contain all the znodes created/used by HBase</description>
</property>
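The URI in the exception, hdfs://HAcluster:9000, does not match the hbase.rootdir value above, which has no port. The port is therefore probably coming from some other property visible to the RegionServer. One way to look for it is to scan every property value for an hdfs:// URI that names the nameservice together with a port. The Python sketch below does this. The function name is hypothetical, and the `core-site.xml` content with its `fs.defaultFS` value is invented purely to show a hit, since that file is not included in this post.

```python
import re
import xml.etree.ElementTree as ET

# Matches hdfs://<host>:<port>; group 1 is the host, group 2 the port.
HDFS_URI_WITH_PORT = re.compile(r"hdfs://([^/:\s]+):(\d+)")


def find_ported_nameservice_uris(files, nameservice):
    """Return (file, property, value) for values that give `nameservice` a port.

    `files` maps a file name to its XML text. Hypothetical helper,
    not part of Hadoop or HBase.
    """
    hits = []
    for filename, xml_text in files.items():
        for prop in ET.fromstring(xml_text).iter("property"):
            name = (prop.findtext("name") or "").strip()
            value = (prop.findtext("value") or "").strip()
            for host, _port in HDFS_URI_WITH_PORT.findall(value):
                if host.lower() == nameservice.lower():
                    hits.append((filename, name, value))
    return hits


if __name__ == "__main__":
    files = {
        "hbase-site.xml": (
            "<configuration><property><name>hbase.rootdir</name>"
            "<value>hdfs://HAcluster/hbase</value></property></configuration>"
        ),
        # HYPOTHETICAL content: core-site.xml is not shown in the post.
        "core-site.xml": (
            "<configuration><property><name>fs.defaultFS</name>"
            "<value>hdfs://HAcluster:9000</value></property></configuration>"
        ),
    }
    for hit in find_ported_nameservice_uris(files, "HAcluster"):
        print(hit)
```

In practice the dictionary would be filled from the actual files in the Hadoop and HBase configuration directories on the RegionServer host. Each file needs a single root element for the parser to accept it.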
hbase-env.sh - only the relevant part
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
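I searched on Google before posting. Nothing I found worked for me, so I tried a few things:
- I tried changing the HBase version. I downloaded the latest one (0.98.17-hadoop2). No effect.
- I tried starting from scratch, that is: formatting HDFS, deleting the ZooKeeper metadata, deleting the znodes, etc.
- I tried replacing hdfs://HAcluster/hbase with hdfs://MASTER1:9000/hbase on every server that runs HBase. No effect.
So I am a bit lost, because I still get the error even without the logical cluster.
PS: Everything else works as expected: the DataNodes / NodeManagers connect to the active NameNode / ResourceManager (checked with the web UIs). The HBase Master also runs correctly, and the backup master is taken into account as well (checked with the web UI). This is another reason why I do not understand this error.
I hope I have given you everything needed to understand my problem.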