Hadoop 2.7.2 - Datanodes start and then stop

Time: 2016-02-01 18:40:57

Tags: apache hadoop amazon-web-services

Environment details:

I installed a Hadoop 2.7.2 multi-node cluster on AWS (not HW, but pure Hadoop): 1 Namenode / 1 Secondary NN / 3 datanodes, all on Ubuntu 14.04.

The cluster is based on this tutorial (http://mfaizmzaki.com/2015/12/17/how-to-install-hadoop-2-7-1-multi-node-cluster-on-amazon-aws-ec2-instance-improved-part-1/), which means the first installation (the master) was cloned and then adjusted for each node.

Problem:

Each of the 3 datanodes works fine on its own if I configure the cluster with just that 1 datanode (specifically excluding the other 2).

As soon as I add another datanode, the datanode that started first logs a FATAL error (see the log file excerpt below and the snapshot of the VERSION file) and stops. The datanode that started second then works fine...

Any ideas or recommendations? Did I do something wrong when cloning the master's AMI onto the other machines? Thanks guys!

Log file:

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Unsuccessfully sent block report 0x1858458671b, containing 1 storage report(s), of which we sent 0. The reports had 0 total blocks and used 0 RPC(s). This took 5 msec to generate and 35 msecs for RPC and NN processing. Got back no commands.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000 is shutting down org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Data node DatanodeRegistration(172.X.X.X:50010, datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0;nsid=278157295;c=0) is attempting to report storage ID 54bc8b80-b84f-4893-8b96-36568acc5d4b. Node 172.Z.Z.Z:50010 is expected to serve this storage.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1251070591-172.31.34.94-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) 

INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1251070591-172.31.34.94-1454167071207

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at HNDATA2/172.X.X.x
************************************************************/

1 answer:

Answer 0 (score: 0)

You have to add the IP addresses of all three datanodes to the slaves file on the namenode, then restart the cluster. This will fix the problem.

slaves

<IPaddress of datanode1>
<IPaddress of datanode2>
<IPaddress of datanode3>
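A minimal sketch of the steps above, assuming a standard Hadoop 2.7.x layout under $HADOOP_HOME; the datanode addresses are placeholders to be replaced with your actual IPs or hostnames:

```shell
# On the namenode: list every datanode in the slaves file,
# one address per line (replace the placeholders with real IPs).
cat > $HADOOP_HOME/etc/hadoop/slaves <<'EOF'
<IPaddress of datanode1>
<IPaddress of datanode2>
<IPaddress of datanode3>
EOF

# Restart HDFS so the namenode picks up the updated worker list.
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh

# Verify that all three datanodes have registered with the namenode.
hdfs dfsadmin -report
```

If a cloned datanode still gets rejected after this, check `hdfs dfsadmin -report` for how many live nodes actually registered versus how many you listed.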