Slave无法连接到master并在hadoop中启动tasktracker或datanode

时间:2012-12-19 09:36:27

标签: hadoop master-slave task-tracking

我正在使用2节点完全分布式hadoop集群。我试图连接tasktracker在从节点上运行,但它无法连接到我的9000/9001端口。下面是配置文件,所以如果有人发现了什么,请大声欢呼!

来自Tasktracker的错误消息(在master上使用start-all运行)

2012-12-19 09:33:03,161 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-12-19 09:33:03,316 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-12-19 09:33:03,320 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-12-19 09:33:03,320 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2012-12-19 09:33:03,888 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-12-19 09:33:04,502 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2012-12-19 09:33:04,755 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2012-12-19 09:33:04,799 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-12-19 09:33:04,807 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as hadoop
2012-12-19 09:33:04,813 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /tmp/hadoop-hadoop/mapred/local
2012-12-19 09:33:04,826 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-12-19 09:33:04,856 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2012-12-19 09:33:04,857 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
2012-12-19 09:33:04,920 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2012-12-19 09:33:04,923 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort38644 registered.
2012-12-19 09:33:04,926 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort38644 registered.
2012-12-19 09:33:04,929 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2012-12-19 09:33:04,931 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 38644: starting
2012-12-19 09:33:04,931 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 38644: starting
2012-12-19 09:33:04,932 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 38644: starting
2012-12-19 09:33:04,932 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 38644: starting
2012-12-19 09:33:04,933 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 38644: starting
2012-12-19 09:33:04,935 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:38644
2012-12-19 09:33:04,935 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_10.77.26.116:localhost/127.0.0.1:38644
2012-12-19 09:33:05,980 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:33:06,982 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:33:07,985 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:33:08,987 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:33:09,989 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:33:10,991 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:33:11,994 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:33:12,996 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:33:13,998 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:33:15,001 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:33:15,004 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:33:17,009 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:33:18,011 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:33:19,013 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:33:20,015 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:33:21,018 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:33:22,020 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:33:23,022 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:33:24,026 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:33:25,033 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:33:26,036 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:33:26,039 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:33:28,044 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:33:29,045 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:33:30,048 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:33:31,051 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:33:32,055 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:33:33,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:33:34,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:33:35,063 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:33:36,071 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:33:37,073 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:33:37,083 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:33:39,086 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:33:40,094 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:33:41,097 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:33:42,101 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:33:43,104 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:33:44,107 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:33:45,113 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:33:46,118 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:33:47,122 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:33:48,131 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:33:48,134 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:33:50,137 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:33:51,140 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:33:52,143 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:33:53,145 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:33:54,148 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:33:55,151 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:33:56,154 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:33:57,158 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:33:58,161 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:33:59,167 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:33:59,169 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:34:01,173 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).
2012-12-19 09:34:02,175 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 1 time(s).
2012-12-19 09:34:03,178 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 2 time(s).
2012-12-19 09:34:04,181 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 3 time(s).
2012-12-19 09:34:05,183 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 4 time(s).
2012-12-19 09:34:06,189 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 5 time(s).
2012-12-19 09:34:07,191 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 6 time(s).
2012-12-19 09:34:08,193 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 7 time(s).
2012-12-19 09:34:09,195 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 8 time(s).
2012-12-19 09:34:10,196 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 9 time(s).
2012-12-19 09:34:10,199 INFO org.apache.hadoop.ipc.RPC: Server at ipdiscovermaster.cloudapp.net/168.63.72.148:9001 not available yet, Zzzzz...
2012-12-19 09:34:12,203 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ipdiscovermaster.cloudapp.net/168.63.72.148:9001. Already tried 0 time(s).

MASTER主持文件

#127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
#10.77.42.2 ipdiscovermaster.cloudapp.net
ipdiscoverreg1.cloudapp.net
#10.76.174.108 ipdiscoverreg1.cloudapp.net
ipdiscovermaster.cloudapp.net

MASTER core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://ipdiscovermaster.cloudapp.net:9000</value>
</property>
</configuration>

MASTER mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>ipdiscovermaster.cloudapp.net:9001</value>
</property>
</configuration>

MASTER masters file

ipdiscovermaster.cloudapp.net

MASTER奴隶文件

ipdiscovermaster.cloudapp.net
ipdiscoverreg1.cloudapp.net

SLAVE托管文件

#127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
#10.77.42.2 ipdiscovermaster.cloudapp.net
ipdiscoverreg1.cloudapp.net
ipdiscovermaster.cloudapp.net
#10.76.174.108 ipdiscoverreg1.cloudapp.net

SLAVE core-site.xml

    <configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://ipdiscovermaster.cloudapp.net:9000</value>
</property>
</configuration>

SLAVE mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>ipdiscovermaster.cloudapp.net:9001</value>
</property>
</configuration>

SLAVE掌握文件

ipdiscovermaster.cloudapp.net

3 个答案:

答案 0 :(得分:1)

您需要检查以下可能性

我很有趣你有检查登录Datanode(192.168.135.111 slave01)这是最好的方法去获取确切的错误

如果您已格式化nameNode

 i)delete temp data folder ..
 ii)recreate it 
 iii)give all the permission to temp folder
 iv)format namenode
 v)start hadoop cluster

答案 1 :(得分:0)

将slave的IP和主机名添加到主机的/ etc / hosts文件中,反之亦然。另外,在hdfs-site.xml文件中添加dfs.data.dir和dfs.name.dir属性。这些值默认为/ temp,在重启时会清空。因此,您可能会丢失信息并在机器重启时遇到一些问题。确保你有正确的名称解析,因为这对正确的hadoop功能非常重要。

答案 2 :(得分:0)

我遇到了类似的问题。日志只显示“重试连接到服务器XXX”。以下是我为解决这个问题所做的工作。只需修改master&amp;从属节点/etc/hosts文件,特别是它自己的主机名和相应的IP。 不要将主机名绑定到127.0.0.1

master中的原始主机文件:

127.0.0.1  master

192.168.135.111 slave01

slave中的原始主机文件:

192.168.135.110 master

127.0.0.1 slave01
master中的

已解决主机文件:

**192.168.135.110** master

192.168.135.111 slave
奴隶中

解析主机文件:

192.168.135.110 master

**192.168.135.111** slave