Hive job fails with a MapReduce error: Call From hmaster/127.0.0.1 to localhost:44849 failed on connection exception

Date: 2014-12-02 10:21:23

Tags: hadoop mapreduce hive hql

When I run the following on the Hive command line:

hive > select count(*) from alogs;

the terminal shows the following:

Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1417084377943_0009, Tracking URL = http://localhost:8088/proxy/application_1417084377943_0009/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1417084377943_0009
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-12-02 17:59:44,068 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1417084377943_0009 with errors
Error during job, obtaining debugging information...
**FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask**
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

I then opened the ResourceManager web UI to see the error details:

Application application_1417084377943_0009 failed 2 times due to Error launching appattempt_1417084377943_0009_000002. Got exception: **java.net.ConnectException: Call From hmaster/127.0.0.1 to localhost:44849 failed on connection exception: java.net.ConnectException: Connection refused;** For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1415)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy32.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:712)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1382)
    ... 9 more
    . Failing the application. 

Although the error message is detailed enough, I don't know where the setting 'localhost:44849' is configured, or what 'Call From hmaster/127.0.0.1 to localhost:44849 failed on connection exception' actually means.
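A quick way to see what the exception means: a TCP connect to a port with no listener is refused by the OS. This can be reproduced with plain bash, nothing Hadoop-specific (port 44849 taken from the error above):

```shell
# Bash's /dev/tcp pseudo-device attempts a real TCP connect.
# With no process listening on the port, the connect fails,
# which is exactly the "Connection refused" in the stack trace.
if (exec 3<>/dev/tcp/127.0.0.1/44849) 2>/dev/null; then
  echo "port open"
else
  echo "connection refused"
fi
```

So the question boils down to: which daemon was supposed to be listening on localhost:44849, and why is it not running there?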

2 answers:

Answer 0 (score: 0)

I ran into the same problem when running an application written with Spring YARN. I found a solution, and after applying it I tested the YARN application many times without hitting this error again.

First, edit /etc/hosts on every server and list all the slave nodes in it, for example:

192.168.0.101 slave1
192.168.0.102 slave2
...

Second, edit yarn-site.xml under /home/user/hadoop/etc/hadoop/ on each of the slave servers and add a property like this:

  <property>
    <name>yarn.nodemanager.address</name>
    <value>slave1:57799</value>
  </property>

Note that the hostname in the value must match the server the file is on, and the port can be any free number you pick, e.g. 57799; the port number must be the same in every yarn-site.xml file.

Third, restart the ResourceManager and all the NodeManagers.
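A sketch of that restart, assuming a standard Hadoop 2.x layout with the daemon scripts under $HADOOP_HOME/sbin (adjust the path to your install):

```shell
# Restart the daemons individually...
$HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager   # on the master
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager       # on each slave
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
```

After the restart, each NodeManager should register with the ResourceManager under its own hostname and the fixed port from yarn-site.xml instead of an ephemeral localhost port.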

I hope this helps you.

In addition, I suspect the problem occurred because I had not added the list of slave nodes to the file

/home/user/hadoop/etc/hadoop/slaves 

but I have not tested that.
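For completeness, the slaves file is just a plain list of worker hostnames, one per line, matching the /etc/hosts entries from the first step (slave1/slave2 are the example names used in this answer):

```
slave1
slave2
```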

Answer 1 (score: 0)

If your Hadoop installation contains the configuration file "..../hadoop-2.8.1/etc/hadoop/mapred-site.xml" but you are not running YARN, Hive tasks may throw a "Retrying connect to server: 0.0.0.0/0.0.0.0:8032" exception. (You may find that "select *" works fine while "select sum()" fails, ┭┮﹏┭┮)

You can run "jps" to check whether YARN is running.

If YARN is not running, the output may look like this:

[cc@localhost conf]$ jps
36721 Jps
8402 DataNode
35458 RunJar
8659 SecondaryNameNode
8270 NameNode

If YARN is running, the output may look like this:

[cc@localhost sbin]$ jps
13237 Jps
9767 DataNode
9975 SecondaryNameNode
12651 ResourceManager (this one is extra)
12956 NodeManager (this one is extra)
9581 NameNode
13135 JobHistoryServer

There are two solutions:

1. Rename the mapred-site.xml file with the Linux command "mv mapred-site.xml mapred-site.xml.template" (or delete mapred-site.xml), and then restart Hadoop.

2. Run YARN. P.S.: modify the Hadoop configuration as needed and start YARN with "start-yarn.sh".
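The second option can be sketched as follows, assuming HADOOP_HOME points at the Hadoop install directory:

```shell
# Start the YARN daemons: the ResourceManager on this node and a
# NodeManager on every host listed in the slaves file
$HADOOP_HOME/sbin/start-yarn.sh

# Optionally start the MapReduce job history server as well
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

# Verify: ResourceManager and NodeManager should now show up
jps
```

Once ResourceManager is listening on port 8032, rerunning the failing "select count(*)" / "select sum()" style queries should submit their MapReduce jobs normally.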