Configuring an edge node to launch Hadoop jobs on a cluster running on a private network

Time: 2015-05-04 11:27:32

Tags: hadoop cloudera

I am trying to set up an edge node for the cluster at my workplace. The cluster is CDH 5.x Hadoop YARN. It has its own internal private high-speed network, and the edge node sits outside that private network.

I ran through the Hadoop client setup steps and configured core-site.xml:
sudo apt-get install hadoop-client
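For reference, the relevant part of my core-site.xml points fs.defaultFS at the NameNode's address (port 8020 below is the CDH default and is an assumption on my part, not confirmed from the cluster):

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://10.100.100.1:8020</value>
</property>
```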

Since the cluster is hosted on its own private network, the IP addresses on the internal network are different:

10.100.100.1 - NameNode
10.100.100.2 - DataNode 1
10.100.100.4 - DataNode 2
10.100.100.6 - DataNode 3

To work around this, I asked the cluster administrator to add the following properties to hdfs-site.xml on the NameNode, so that the listening ports are not open only to the internal IP range:

<property>
  <name>dfs.namenode.servicerpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.http-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.https-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>

After this was set up and the services restarted, I was able to run the following command:

hadoop fs -ls /user/hduser/testData/XML_Flows/test/test_input/*

This works fine. But when I try to cat a file, I get the following error:

administrator@administrator-Virtual-Machine:/etc/hadoop/conf.empty$ hadoop fs -cat /user/hduser/testData/XML_Flows/test/test_input/*
15/05/04 15:39:02 WARN hdfs.BlockReaderFactory: I/O error constructing remote block reader.
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3035)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:744)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:659)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:327)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:574)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:797)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:844)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:104)
    at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:99)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
15/05/04 15:39:02 WARN hdfs.DFSClient: Failed to connect to /10.100.100.6:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]

The same error message is repeated multiple times.
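The timeout points at the edge node being unable to reach the DataNode's data-transfer port (50010 on CDH 5) at all, since -ls only talks to the NameNode while -cat must open a direct TCP connection to a DataNode. A quick probe like the following (using bash's /dev/tcp; the host/port are taken from the error above) would confirm whether that port is reachable from the edge node:

```shell
# Probe a TCP port with a timeout, using bash's /dev/tcp pseudo-device.
# Usage: port_open HOST PORT TIMEOUT_SECONDS  -> prints "open" or "closed"
port_open() {
  timeout "$3" bash -c "</dev/tcp/$1/$2" 2>/dev/null \
    && echo open || echo closed
}

# Check the DataNode transfer port from the edge node:
port_open 10.100.100.6 50010 5
```

If this prints "closed", the problem is plain network routing, not Hadoop configuration.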

To be on the safe side, I copied the rest of the XML files, i.e. hdfs-site.xml, yarn-site.xml, mapred-site.xml, from the cluster's data nodes. But I still get the same error. Does anyone recognize this error, or know how to make an edge node work with a cluster running on a private network?
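One client-side setting that looks relevant here, though I have not verified it, is dfs.client.use.datanode.hostname. By default the NameNode hands the client the DataNodes' registered (private) IPs; with this property set in the client's hdfs-site.xml, the client connects to DataNodes by hostname instead:

```xml
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

Note this only helps if the DataNode hostnames resolve, from the edge node, to addresses that are actually routable.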

The username on the edge node is "administrator", while the cluster is configured with the "hduser" id. Could that be a problem? I have configured passwordless login between the edge node and the name node.
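On the username question: with simple (non-Kerberos) authentication, the HDFS client identifies itself by the local OS username, and the HADOOP_USER_NAME environment variable can override it. A minimal sketch, assuming the cluster uses simple auth (untested guess on my side):

```shell
# Make the HDFS client identify as the cluster-side user instead of
# the local "administrator" account (only works with simple auth):
export HADOOP_USER_NAME=hduser
echo "$HADOOP_USER_NAME"
# hadoop fs -ls /user/hduser   # would now run as hduser, not administrator
```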

0 Answers:

No answers yet.