Hello, I have installed Cloudera Manager. While running my benchmark, the MapReduce job's tasks fail with the following error message:
16/04/01 12:42:40 INFO mapreduce.Job: Job job_1459494626924_0001 running in uber mode : false
16/04/01 12:42:40 INFO mapreduce.Job: map 0% reduce 0%
16/04/01 12:42:54 INFO mapreduce.Job: map 16% reduce 0%
16/04/01 12:42:55 INFO mapreduce.Job: map 29% reduce 0%
16/04/01 12:42:56 INFO mapreduce.Job: map 75% reduce 0%
16/04/01 12:42:57 INFO mapreduce.Job: map 83% reduce 0%
16/04/01 12:42:59 INFO mapreduce.Job: map 100% reduce 0%
16/04/01 12:43:01 INFO mapreduce.Job: Task Id : attempt_1459494626924_0001_r_000000_0, Status : FAILED
Exception from container-launch.
Container id: container_1459494626924_0001_01_000010
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
16/04/01 12:43:01 INFO mapreduce.Job: Task Id : attempt_1459494626924_0001_r_000005_0, Status : FAILED
Exception from container-launch.
Container id: container_1459494626924_0001_01_000015
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
16/04/01 12:43:01 INFO mapreduce.Job: Task Id : attempt_1459494626924_0001_r_000001_0, Status : FAILED
Exception from container-launch.
Container id: container_1459494626924_0001_01_000011
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
16/04/01 12:43:01 INFO mapreduce.Job: Task Id : attempt_1459494626924_0001_r_000006_0, Status : FAILED
Exception from container-launch.
Container id: container_1459494626924_0001_01_000016
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
16/04/01 12:43:01 INFO mapreduce.Job: Task Id : attempt_1459494626924_0001_r_000004_0, Status : FAILED
Exception from container-launch.
Container id: container_1459494626924_0001_01_000014
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
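What might I need to check? I have already tuned YARN using the Excel sheet provided by Cloudera. I also tried adjusting the vcores and memory of the NameNode.
In the log file, I can see the following error messages: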
2016-03-30 22:04:27,404 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn IP=10.195.48.127 OPERATION=refreshNodes TARGET=AdminService RESULT=SUCCESS
2016-03-30 17:38:24,133 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: RECEIVED SIGNAL 15: SIGTERM
2016-03-30 17:38:24,148 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-03-30 17:38:24,151 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@slave5:8088
2016-03-30 17:38:24,151 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-03-30 17:38:24,152 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-03-30 17:38:24,254 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2016-03-30 17:38:24,257 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2016-03-30 17:38:24,257 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-03-30 17:38:24,257 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2016-03-30 17:38:24,260 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2016-03-30 17:38:24,260 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-03-30 17:38:24,263 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type STATUS_UPDATE for node slave2:8041
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:778)
at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:736)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:418)
at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:79)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:866)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:850)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:174)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:242)
... 13 more
2016-03-30 17:38:24,283 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2016-03-30 17:38:24,283 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ResourceManager metrics system...
2016-03-30 17:38:24,284 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system stopped.
2016-03-30 17:38:24,285 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system shutdown complete.
2016-03-30 17:38:24,285 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher is draining to stop, igonring any new events.
Can anyone suggest what the problem might be?