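I have a Hadoop cluster with 3 slaves and 1 master, on top of which runs an HBase cluster with 3 region servers (RS) and 1 master. In addition, there is a ZooKeeper ensemble on 3 separate machines.
The Hadoop cluster and the ZooKeeper ensemble work fine. The HBase cluster, however, fails to initialize correctly.
I start HBase by running ./bin/start-hbase.sh. This starts the HBase master and the region servers correctly, and the hbase folder in HDFS is set up correctly.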
jps on master
hduser@master:~/hbase$ jps
5694 HMaster
3934 JobHistoryServer
3786 NameNode
3873 ResourceManager
6025 Jps
jps on slaves
5737 Jps
5499 HRegionServer
3736 DataNode
3820 NodeManager
However, the HBase master never registers the region servers, as the logs also show:
Master log
[master:master:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1511 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
Slave log
[regionserver60020] regionserver.HRegionServer: reportForDuty to master=master,60000,1404856451890 with port=60020, startcode=1404856453874
[regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending local=/10.0.2.15:53939 remote=master/192.168.66.60:60000]
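The exception names both ends of the failed connection, so the two addresses can be compared directly. Below is a minimal Python sketch, assuming the log format shown above and assuming the cluster network is 192.168.66.0/24 (inferred from the addresses in /etc/hosts, not stated explicitly). `parse_endpoints` is a hypothetical helper name.

```python
import ipaddress
import re

# Log line copied from the region server log above.
LOG_LINE = (
    "com.google.protobuf.ServiceException: "
    "org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout "
    "while waiting for channel to be ready for connect. ch : "
    "java.nio.channels.SocketChannel[connection-pending "
    "local=/10.0.2.15:53939 remote=master/192.168.66.60:60000]"
)

# Assumption: all cluster hosts share this /24 network.
CLUSTER_NET = ipaddress.ip_network("192.168.66.0/24")


def parse_endpoints(line):
    """Return (local_ip, remote_ip) from a SocketChannel description."""
    match = re.search(
        r"local=[^/\s]*/([\d.]+):\d+ remote=[^/\s]*/([\d.]+):\d+", line
    )
    if match is None:
        raise ValueError("no SocketChannel endpoints found in line")
    return (
        ipaddress.ip_address(match.group(1)),
        ipaddress.ip_address(match.group(2)),
    )


local_ip, remote_ip = parse_endpoints(LOG_LINE)
print("local :", local_ip, "in cluster net:", local_ip in CLUSTER_NET)
print("remote:", remote_ip, "in cluster net:", remote_ip in CLUSTER_NET)
```

With the line above, the local end is reported as outside the assumed cluster network, while the remote end (the master) is inside it.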
Here are the configuration details:
/etc/hosts on master
192.168.66.63 slave-3 # Data Node and Region Server
192.168.66.60 master # Name Node and HBase Master
192.168.66.73 zookeeper-3 # Zookeeper node
192.168.66.71 zookeeper-1 # Zookeeper node
192.168.66.72 zookeeper-2 # Zookeeper node
192.168.66.62 slave-2 # Data Node and Region Server
192.168.66.61 slave-1 # Data Node and Region Server
/etc/hosts on slave-1
192.168.66.60 master
192.168.66.73 zookeeper-3
192.168.66.71 zookeeper-1
192.168.66.72 zookeeper-2
hbase-site.xml on all cluster nodes
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.tmp.dir</name>
<value>/home/hduser/hbase/tmp</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.local.dir</name>
<value>/home/hduser/hbase/local</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>6010</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zookeeper-1,zookeeper-2,zookeeper-3,</value>
</property>
</configuration>
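As a quick sanity check on the file above, the properties can be read back with Python's standard library. This is only a sketch: the XML is inlined and abbreviated to two properties rather than read from disk, and `read_properties` is a hypothetical helper. It also shows that the trailing comma in hbase.zookeeper.quorum yields an empty entry under a plain split. How HBase itself treats that trailing comma is not checked here.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the hbase-site.xml shown above (assumption: inlined
# here for illustration instead of being read from the real file).
HBASE_SITE = """<?xml version="1.0"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zookeeper-1,zookeeper-2,zookeeper-3,</value>
</property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict of property name -> value from Hadoop-style XML."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }


props = read_properties(HBASE_SITE)

# A plain split keeps the empty string after the trailing comma.
raw_quorum = props["hbase.zookeeper.quorum"].split(",")
# Dropping blank entries gives the three intended hosts.
quorum = [host.strip() for host in raw_quorum if host.strip()]

print(raw_quorum)
print(quorum)
```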
regionservers file on master and slaves
slave-3
slave-1
slave-2
hbase-env.sh on master and slaves
export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_MANAGES_ZK=false
What am I doing wrong that prevents the nodes from talking to each other? I am using Hadoop 2.4.0, HBase 0.98.3 and ZooKeeper 3.4.6 on Ubuntu Trusty Tahr x64.
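Answer 0 (score: 1)
Ian Brooks on the HBase mailing list solved my mystery. Basically, I needed to list the slaves manually in each slave's /etc/hosts (I suspect I only needed to add the slave itself), so that I ended up with the following:
/etc/hosts (RS)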
192.168.66.60 master
192.168.66.73 zookeeper-3
192.168.66.71 zookeeper-1
192.168.66.72 zookeeper-2
192.168.66.61 slave-1
192.168.66.62 slave-2
192.168.66.63 slave-3
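The reason is that the slaves had more than one eth interface running, and the local host was being addressed on a different IP: note the local=/10.0.2.15 address in the slave log, rather than a 192.168.66.x address.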
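To catch this kind of gap before starting HBase, one could compare the regionservers file against a node's /etc/hosts. Here is a rough Python sketch with both files inlined as strings (the first hosts text is the original, pre-fix content from slave-1). `parse_hosts` and `missing_hosts` are hypothetical helpers, not part of HBase.

```python
# Original /etc/hosts on slave-1, as posted in the question.
ORIGINAL_SLAVE_HOSTS = """\
192.168.66.60 master
192.168.66.73 zookeeper-3
192.168.66.71 zookeeper-1
192.168.66.72 zookeeper-2
"""

# The same file after adding the slave entries from the answer.
FIXED_SLAVE_HOSTS = ORIGINAL_SLAVE_HOSTS + """\
192.168.66.61 slave-1
192.168.66.62 slave-2
192.168.66.63 slave-3
"""

# Contents of the regionservers file.
REGIONSERVERS = """\
slave-3
slave-1
slave-2
"""


def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # first entry wins
    return mapping


def missing_hosts(hosts_text, required):
    """Return the required hostnames that the hosts text does not define."""
    known = parse_hosts(hosts_text)
    return [name for name in required if name not in known]


required = REGIONSERVERS.split() + ["master"]
print(missing_hosts(ORIGINAL_SLAVE_HOSTS, required))
print(missing_hosts(FIXED_SLAVE_HOSTS, required))
```

The first call lists the three slave names as missing and the second returns an empty list. This only checks that the names are present in the file; it does not verify which interface the operating system actually binds to.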