Hadoop 3.2.0 not working in a cluster (VirtualBox)

Asked: 2019-03-09 13:19:40

Tags: apache hadoop hdfs

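I am trying to set up a VirtualBox Hadoop cluster with 1 namenode and 2 datanodes for testing. I followed several tutorials, but when I run start-dfs.sh on the namenode, it starts only the namenode process and not the datanodes.

I can start each daemon individually, but it does not seem to work as a cluster.

Basically, I set up one server (Debian 9) and configured a static IP for each VM: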

hadoop@namenode:~$ cat /etc/hosts
127.0.0.1   localhost namenode
192.168.10.100 namenode.com
192.168.10.161 datanode1.com
192.168.10.162 datanode2.com
hadoop@namenode:~$ cat hadoop/etc/hadoop/slaves
datanode1.com
datanode2.com
hadoop@namenode:~$ cat hadoop/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://namenode.com:9000</value>
        </property>
</configuration>
hadoop@namenode:~$ cat hadoop/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/hadoop/data/nameNode</value>
    </property>
    <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/hadoop/data/dataNode</value>
    </property>
    <property>
            <name>dfs.replication</name>
            <value>1</value>
    </property>
</configuration>

I copied all of the configuration to every VM, logged in to the namenode, and formatted it with hdfs namenode -format.

I checked that the clusterID is the same on all servers:

hadoop@namenode:~$ cat data/dataNode/current/VERSION
#Sat Mar 09 07:58:36 EST 2019
storageID=DS-cc3b3c25-46c8-467c-8a7b-2311f82e9790
clusterID=CID-b0b63b58-73bd-4e6b-85cd-31c353052db6
cTime=0
datanodeUuid=d9a14382-7694-476c-864b-9164de01a92e
storageType=DATA_NODE
layoutVersion=-57
hadoop@namenode:~$ cat data/nameNode/current/VERSION
#Sat Mar 09 07:55:26 EST 2019
namespaceID=1109263708
clusterID=CID-b0b63b58-73bd-4e6b-85cd-31c353052db6
cTime=1551735568343
storageType=NAME_NODE
blockpoolID=BP-1318860827-127.0.0.1-1551735568343
layoutVersion=-65
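
The clusterID comparison above can also be scripted. The following is a minimal sketch and not part of the original post: cluster_ids and same_cluster are hypothetical helper names, and it assumes the VERSION files from each node are readable locally (for example, after copying them over).

```shell
# cluster_ids FILE...: print each distinct clusterID line found in the
# given VERSION files (hypothetical helper, not a Hadoop command).
cluster_ids() {
    grep -h '^clusterID=' "$@" | sort -u
}

# same_cluster FILE...: succeed only if the files share exactly one clusterID.
same_cluster() {
    [ "$(cluster_ids "$@" | grep -c '')" -eq 1 ]
}

# Example call, using the paths from the question:
#   same_cluster data/nameNode/current/VERSION data/dataNode/current/VERSION \
#       && echo "clusterID matches"
```

A matching clusterID only rules out a format mismatch; it says nothing about whether start-dfs.sh knows which hosts to start datanodes on.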

I don't see anything strange in the logs other than:

hadoop@namenode:~$ cat hadoop/logs/* | grep ERROR
2019-03-04 17:40:24,433 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,441 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 1: SIGHUP
2019-03-09 07:57:10,818 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,397 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,417 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 1: SIGHUP
2019-03-09 07:57:09,420 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:29:25,258 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,434 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,441 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 1: SIGHUP
2019-03-04 17:40:24,420 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: RECEIVED SIGNAL 15: SIGTERM
2019-03-04 17:40:24,430 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: RECEIVED SIGNAL 1: SIGHUP
2019-03-04 17:40:24,593 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2019-03-04 17:40:24,791 ERROR org.apache.hadoop.yarn.event.EventDispatcher: Returning, interrupted : java.lang.InterruptedException
2019-03-04 17:40:24,797 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2019-03-04 17:40:24,406 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: RECEIVED SIGNAL 15: SIGTERM
cat: hadoop/logs/userlogs: Is a directory
2019-03-04 17:40:24,418 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: RECEIVED SIGNAL 1: SIGHUP
2019-03-09 07:57:14,149 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: RECEIVED SIGNAL 15: SIGTERM

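I have already tried deleting the data folders and reformatting, but it still does not work.

Any ideas?

1 Answer:

Answer 0: (score: 0)

After struggling for a few days, I realized the problems were:

- When following tutorials, make sure core-site.xml uses the property fs.defaultFS rather than fs.default.name.
- Second, I had always added the datanodes to /etc/hadoop/slaves, but I was missing the /etc/hadoop/workers file. In Hadoop 3, the start scripts read the datanode hosts from workers, not slaves.

After adding the datanodes there, I reformatted and restarted the cluster, and it works.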
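
The two fixes can be sketched as shell helpers. This is only an illustration under assumptions: migrate_workers and uses_deprecated_fs_name are hypothetical names, and the configuration directory is passed in explicitly (in the question it is hadoop/etc/hadoop under the hadoop user's home directory).

```shell
# migrate_workers CONF_DIR: copy the Hadoop 2 style "slaves" host list to the
# "workers" file that the Hadoop 3 start scripts read (hypothetical helper).
# This overwrites any existing workers file in CONF_DIR.
migrate_workers() {
    conf_dir=$1
    cp "$conf_dir/slaves" "$conf_dir/workers"
}

# uses_deprecated_fs_name CONF_DIR: succeed if core-site.xml still sets the
# deprecated fs.default.name property instead of fs.defaultFS.
uses_deprecated_fs_name() {
    grep -q '<name>fs\.default\.name</name>' "$1/core-site.xml"
}

# Example calls, assuming the layout from the question:
#   migrate_workers "$HOME/hadoop/etc/hadoop"
#   uses_deprecated_fs_name "$HOME/hadoop/etc/hadoop" \
#       && echo "rename fs.default.name to fs.defaultFS"
```

Copying slaves to workers keeps a single host list, so the two files cannot drift apart while both still exist.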