Hive Job Killed - IPC Connection Reset by Peer

Date: 2016-08-18 13:39:33

Tags: hadoop mapreduce hive hdfs hadoop2

My Hive script fires many queries, and after executing a number of them it fails at a particular point.

Vanilla Hive version: 1.2.1
Execution engine: MapReduce

In the logs I can see the connection being reset by one of the data nodes. The problem is intermittent; sometimes when this error occurs, the data node restarts.

Does this look like just a network issue?

Could this be a memory-related issue? I am using hive.auto.convert.join.noconditionaltask = true. Could that be causing excessive network traffic?
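
If the map-join conversion is the suspect, one quick test is to disable it for the session and rerun the failing query. This is a minimal diagnostic sketch; the size threshold in the commented line is an illustrative assumption, not a value taken from this job:

 -- Disable automatic map-join conversion for this session to see whether
 -- the failure tracks map-side join memory pressure.
 SET hive.auto.convert.join=false;
 SET hive.auto.convert.join.noconditionaltask=false;
 -- Or keep the conversion but shrink the combined small-table threshold
 -- (the ~10 MB value below is an assumption; tune it for your cluster):
 -- SET hive.auto.convert.join.noconditionaltask.size=10000000;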

Here is the log snippet. Thanks in advance!!

 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_183782727_8787_m_000000_1 TaskAttempt Transitioned from NEW to UNASSIGNED
 INFO [Thread-50] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node node3
 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_876576567_8787_m_000001_1 TaskAttempt Transitioned from NEW to UNASSIGNED
 INFO [Thread-50] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_8787483787847_9124_m_000000_1 to list of failed maps
 INFO [Thread-50] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_8787483787847_9124_m_000000_1 to list of failed maps
 INFO [Socket Reader #1 for port 46764] org.apache.hadoop.ipc.Server: Socket Reader #1 for port 46764: readAndProcess from client 160.43.98.11 threw exception [java.io.IOException: Connection reset by peer]
java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 at sun.nio.ch.IOUtil.read(IOUtil.java:197)
 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
 at org.apache.hadoop.ipc.Server.channelRead(Server.java:2603)
 at org.apache.hadoop.ipc.Server.access$2800(Server.java:136)
 at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1481)
 at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
 at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
 at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)
 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_8787483787847_9124: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:146608, vCores:1> knownNMs=5
 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_8787483787847_9124_01_000002
 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_8787483787847_9124_01_000003
 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:1 RackLocal:1
 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_8787483787847_9124_m_000000_1: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_8787483787847_9124_m_000000_1: Container killed by the ApplicationMaster.
INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 2
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
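
For reference, exit code 143 is 128 + 15, i.e. SIGTERM: the container was killed on request (here by the ApplicationMaster after the failed attempt), not by a crash of its own. If container memory is suspected, rerunning with larger container and heap sizes can help narrow it down. A minimal sketch, with illustrative values that are assumptions rather than settings from this job:

 -- Raise MapReduce container sizes and keep each JVM heap around 80%
 -- of its container to leave headroom for non-heap memory.
 SET mapreduce.map.memory.mb=4096;
 SET mapreduce.map.java.opts=-Xmx3276m;
 SET mapreduce.reduce.memory.mb=8192;
 SET mapreduce.reduce.java.opts=-Xmx6553m;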

0 Answers:

No answers