DataNode无法与名称节点连接-“ org.apache.hadoop.ipc.Client：重试连接到服务器”

时间：2019-08-27 16:11:36

标签： java apache hadoop hdfs hadoop2

我已经部署了带有1个Namenode和2个Datanode的Hadoop 3.1.2集群。 NameNode已启动，secondaryNameNode和ResourceManager也已启动为主节点，但是DataNode无法与NameNode连接，因此没有显示容量。

我一直在尝试找出可能的错误，但到目前为止还没有成功。

在遇到奇怪错误时删除了域解析度：

WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [server]
lim_sbo_bigdata_master: ERROR: Cannot set priority of namenode process 11606
Starting datanodes
Starting secondary namenodes [server]
lim_sbo_bigdata_master: ERROR: Cannot set priority of secondarynamenode process 11825
Starting resourcemanager
Starting nodemanagers


* SELinux is disabled
* IPtables is OPEN for all traffic:

hadoop@lim_server]$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

服务器属于同一网络。

NameNode：

[hadoop@server ~]$ hadoop version
Hadoop 3.1.2
Source code repository https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a
Compiled by sunilg on 2019-01-29T01:39Z
Compiled with protoc 2.5.0
From source with checksum 64b8bdd4ca6e77cce75a93eb09ab2a9
This command was run using /home/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar

[hadoop@server ~]$ jps
27089 Jps
26760 ResourceManager
26491 SecondaryNameNode
26239 NameNode

[hadoop@server ~]$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: 0.00%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups: 
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0

<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>

DataNode错误

[hadoop@server_2]$ jps
17052 DataNode
17166 NodeManager
17406 Jps

2019-08-27 05:46:09,086 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 9867
2019-08-27 05:46:09,229 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:9867
2019-08-27 05:46:09,243 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2019-08-27 05:46:09,251 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2019-08-27 05:46:09,260 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (Datanode Uuid unassigned) service to /10.30.17.228:9000 starting to offer serv
ice
2019-08-27 05:46:09,265 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2019-08-27 05:46:09,265 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9867: starting
2019-08-27 05:46:10,330 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 10.30.17.228/10.30.17.228:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountW
ithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2019-08-27 05:46:11,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 10.30.17.228/10.30.17.228:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountW
ithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

1 个答案:

答案 0 :(得分：0)

尝试将“ localhost”更改为名称节点的实际主机名或IP。