Hadoop 2.6 multi-node cluster fails with a connection exception when running the example jar

Time: 2015-05-25 19:34:26

Tags: hadoop connection-refused

Every example Hadoop 2.6 MapReduce application fails with the same error, java.net.ConnectException: Connection refused. The error output is:

    hduser@localhost:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /usr/local/hadoop/input  output_wordcount
    15/05/26 06:01:14 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.111.72:8040
    15/05/26 06:01:15 INFO input.FileInputFormat: Total input paths to process : 1
    15/05/26 06:01:15 INFO mapreduce.JobSubmitter: number of splits:1
    15/05/26 06:01:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1432599812585_0002
    15/05/26 06:01:16 INFO impl.YarnClientImpl: Submitted application application_1432599812585_0002
    15/05/26 06:01:16 INFO mapreduce.Job: The url to track the job: http://localhost.localdomain:8088/proxy/application_1432599812585_0002/
    15/05/26 06:01:16 INFO mapreduce.Job: Running job: job_1432599812585_0002
    15/05/26 06:01:37 INFO mapreduce.Job: Job job_1432599812585_0002 running in uber mode : false
    15/05/26 06:01:37 INFO mapreduce.Job:  map 0% reduce 0%
    15/05/26 06:01:37 INFO mapreduce.Job: Job job_1432599812585_0002 failed with state FAILED due to: Application application_1432599812585_0002 failed 2 times due to Error launching appattempt_1432599812585_0002_000002. Got exception: java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:56148 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy31.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 9 more
    . Failing the application.
    15/05/26 06:01:37 INFO mapreduce.Job: Counters: 0

My /etc/hosts looks like this:

    127.0.0.1       localhost.localdomain localhost
    127.0.1.1       ubuntu-Standard-PC-i440FX-PIIX-1996

    192.168.111.72  master
    192.168.111.65  slave1
    192.168.111.66  slave2

    # The following lines are desirable for IPv6 capable hosts
    #::1     ip6-localhost ip6-loopback
    #fe00::0 ip6-localnet
    #ff00::0 ip6-mcastprefix
    #ff02::1 ip6-allnodes
    #ff02::2 ip6-allrouters

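After trying many other possibilities, I commented out the IPv6 lines. I would like to know where the error actually is. Thanks in advance for your replies.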
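One thing that may be worth checking in a hosts file like the one above is which names end up mapped to a loopback address, since a node that advertises such a name is only reachable from itself. The following is a minimal Python sketch under that assumption; `loopback_names` and `SAMPLE_HOSTS` are hypothetical names introduced for illustration, and the sample text is copied from the active (uncommented) lines of the hosts file in this question.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the hostnames that hosts-format text maps to a loopback address."""
    names = []
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, hostnames = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # Skip lines whose first field is not an IP address.
        if is_loopback:
            names.extend(hostnames)
    return names


# Active lines of the /etc/hosts file shown in the question.
SAMPLE_HOSTS = """\
127.0.0.1       localhost.localdomain localhost
127.0.1.1       ubuntu-Standard-PC-i440FX-PIIX-1996

192.168.111.72  master
192.168.111.65  slave1
192.168.111.66  slave2
"""

if __name__ == "__main__":
    for name in loopback_names(SAMPLE_HOSTS):
        print(name)
```

To check a real machine, the same function could be given the contents of /etc/hosts read from disk. Any name it reports should not be the name a cluster node uses to identify itself to other nodes.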

Thanks for your reply, @Ashok. But jps on the master and the slaves shows that all daemons are running. Output attached:

Master:

    hduser@localhost:~$ jps
    23518 Jps
    10442 NameNode
    10752 SecondaryNameNode
    12348 ResourceManager

Slave1:

    hduser@localhost:~$ jps
    28691 NodeManager
    13987 Jps
    27298 DataNode

The same goes for slave2.

2 Answers:

Answer 0 (score: 1)

Found the solution!!

Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:56148 failed on connection exception: java.net.ConnectException: Connection refused;

Both the master and the slaves had the hostname localhost.localdomain in /etc/hostname. I changed the slaves' hostnames to slave1 and slave2, and that worked. Thanks, everyone, for your time.
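The clue is in the error message itself: the caller and the target carry the same host name, which suggests the ResourceManager was trying to reach a NodeManager under a name that resolves back to the master. As a rough illustration, the Python sketch below pulls the two endpoints out of such a message. The regular expression is an assumption based only on the one message quoted in this thread, not a documented Hadoop format, and `parse_call_endpoints` is a hypothetical helper name.

```python
import re

# Pattern inferred from the single error message quoted in this thread.
CALL_PATTERN = re.compile(
    r"Call From (?P<src_host>[^/\s]+)/(?P<src_ip>\S+) "
    r"to (?P<dst_host>[^:\s]+):(?P<dst_port>\d+) failed"
)

MESSAGE = (
    "Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:56148 "
    "failed on connection exception: java.net.ConnectException: Connection refused;"
)


def parse_call_endpoints(message):
    """Extract caller and target from a 'Call From ... to ...' message, or return None."""
    match = CALL_PATTERN.search(message)
    if match is None:
        return None
    info = match.groupdict()
    info["dst_port"] = int(info["dst_port"])
    # True when caller and target use the same host name, as in this thread.
    info["same_host_name"] = info["src_host"] == info["dst_host"]
    return info


if __name__ == "__main__":
    print(parse_call_endpoints(MESSAGE))
```

When `same_host_name` is true on a multi-node cluster, the hostnames the nodes report about themselves are a reasonable first thing to inspect.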

Answer 1 (score: 0)

It looks like your NameNode, or one of the other daemons, is not running. Also make sure the nodes can ping each other.
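A ping only shows that a host answers ICMP; it does not show that a daemon is listening on a given port. As a complementary check, here is a minimal Python sketch that tries to open a TCP connection instead. `can_connect` is a hypothetical helper, and the host and port in the example are taken from the log in the question (`master`, port 8040), so they would need adjusting for another cluster.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Endpoint taken from the "Connecting to ResourceManager" log line in the question.
    for host, port in [("master", 8040)]:
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Running this from each node against the others gives a quick picture of which daemon ports are actually reachable over the network.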