I am trying to run an Oozie workflow that uses a spark-action with yarn-cluster as the master. When I use spark-submit from the command line, the application seems to run fine:
./bin/spark-submit --verbose --class com.demo.spark.hdfs.SparkHDFSDemo --master yarn-cluster hdfs://namenode:8020/user/root/word-count/spark-hdfs.jar
But when I try to run the same application through the spark-action, it keeps running for about 20 minutes and then gets killed. Here is my job.properties:
nameNode=hdfs://namenode:8020
jobTracker=job-tracker:8050
master=yarn-cluster
name=spark-hdfs-word-count
class=com.demo.spark.hdfs.SparkHDFSDemo
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/root/word-count/spark-hdfs-workflow.xml
jar=${nameNode}/user/root/word-count/spark-hdfs.jar
Here is my spark-hdfs-workflow.xml:
<workflow-app name="word-count-001" xmlns="uri:oozie:workflow:0.1">
    <start to="spark-hdfs" />
    <action name="spark-hdfs">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>${name}</name>
            <class>${class}</class>
            <jar>${jar}</jar>
        </spark>
        <ok to="end" />
        <error to="kill" />
    </action>
    <kill name="kill">
        <message>failed</message>
    </kill>
    <end name="end" />
</workflow-app>
I also checked the contents of my HDFS share lib, and it does contain spark-assembly.jar and the other jars. Here are the logs I got from the YARN ResourceManager UI:
2015-08-30 05:09:57,616 INFO [ContainerLauncher #0] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : namenode:45454
2015-08-30 05:09:57,664 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1440917812423_0018_m_000000_0 : 13562
2015-08-30 05:09:57,665 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1440917812423_0018_m_000000_0] using containerId: [container_e06_1440917812423_0018_01_000002 on NM: [namenode:45454]
2015-08-30 05:09:57,667 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1440917812423_0018_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2015-08-30 05:09:57,667 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1440917812423_0018_m_000000 Task Transitioned from SCHEDULED to RUNNING
2015-08-30 05:09:58,264 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1440917812423_0018: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:70656, vCores:1> knownNMs=6
2015-08-30 05:09:58,993 INFO [Socket Reader #1 for port 60060] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1440917812423_0018 (auth:SIMPLE)
2015-08-30 05:09:59,014 INFO [IPC Server handler 0 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1440917812423_0018_m_6597069766658 asked for a task
2015-08-30 05:09:59,014 INFO [IPC Server handler 0 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1440917812423_0018_m_6597069766658 given task: attempt_1440917812423_0018_m_000000_0
2015-08-30 05:10:05,741 INFO [IPC Server handler 1 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:10:35,789 INFO [IPC Server handler 15 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:11:05,829 INFO [IPC Server handler 22 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:11:35,868 INFO [IPC Server handler 14 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:12:05,898 INFO [IPC Server handler 13 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:12:35,932 INFO [IPC Server handler 8 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:13:05,979 INFO [IPC Server handler 18 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:13:36,010 INFO [IPC Server handler 27 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:14:03,046 INFO [IPC Server handler 16 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:14:33,073 INFO [IPC Server handler 10 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:15:03,117 INFO [IPC Server handler 18 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:15:33,160 INFO [IPC Server handler 6 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:16:03,227 INFO [IPC Server handler 23 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:16:33,254 INFO [IPC Server handler 29 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:17:03,292 INFO [IPC Server handler 22 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:17:33,328 INFO [IPC Server handler 1 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:18:03,374 INFO [IPC Server handler 6 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:18:33,445 INFO [IPC Server handler 29 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:19:03,478 INFO [IPC Server handler 12 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:19:33,513 INFO [IPC Server handler 4 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:20:03,568 INFO [IPC Server handler 26 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:20:33,592 INFO [IPC Server handler 1 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:21:03,624 INFO [IPC Server handler 23 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:21:33,652 INFO [IPC Server handler 11 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:22:03,680 INFO [IPC Server handler 22 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:22:33,705 INFO [IPC Server handler 18 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:23:03,735 INFO [IPC Server handler 13 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:23:33,761 INFO [IPC Server handler 16 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:24:03,801 INFO [IPC Server handler 11 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:24:33,822 INFO [IPC Server handler 8 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:25:03,846 INFO [IPC Server handler 22 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:25:33,873 INFO [IPC Server handler 2 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:26:03,909 INFO [IPC Server handler 20 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:26:33,931 INFO [IPC Server handler 11 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:27:03,964 INFO [IPC Server handler 16 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:27:33,999 INFO [IPC Server handler 29 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:28:04,029 INFO [IPC Server handler 8 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:28:34,048 INFO [IPC Server handler 20 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:29:04,072 INFO [IPC Server handler 4 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:29:34,097 INFO [IPC Server handler 29 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:30:04,131 INFO [IPC Server handler 6 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:30:11,010 INFO [IPC Server handler 17 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:30:11,027 INFO [IPC Server handler 25 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,028 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1440917812423_0018_m_000000_0 TaskAttempt Transitioned from RUNNING to COMMIT_PENDING
2015-08-30 05:30:11,028 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: attempt_1440917812423_0018_m_000000_0 given a go for committing the task output.
2015-08-30 05:30:11,029 INFO [IPC Server handler 14 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit go/no-go request from attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,029 INFO [IPC Server handler 14 on 60060] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Result of canCommit for attempt_1440917812423_0018_m_000000_0:true
2015-08-30 05:30:11,050 INFO [IPC Server handler 15 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1440917812423_0018_m_000000_0 is : 1.0
2015-08-30 05:30:11,052 INFO [IPC Server handler 26 on 60060] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,053 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1440917812423_0018_m_000000_0 TaskAttempt Transitioned from COMMIT_PENDING to SUCCESS_CONTAINER_CLEANUP
2015-08-30 05:30:11,054 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_e06_1440917812423_0018_01_000002 taskAttempt attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,054 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,055 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : namenode:45454
2015-08-30 05:30:11,071 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1440917812423_0018_m_000000_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2015-08-30 05:30:11,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1440917812423_0018_m_000000_0
2015-08-30 05:30:11,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1440917812423_0018_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2015-08-30 05:30:11,082 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2015-08-30 05:30:11,082 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1440917812423_0018Job Transitioned from RUNNING to COMMITTING
2015-08-30 05:30:11,083 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_COMMIT
2015-08-30 05:30:11,115 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
2015-08-30 05:30:11,115 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1440917812423_0018Job Transitioned from COMMITTING to SUCCEEDED
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: We are finishing cleanly so this is the last retry
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator isAMLastRetry: true
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: RMCommunicator notified that shouldUnregistered is: true
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: true
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: JobHistoryEventHandler notified that forceJobCompletion is true
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the services
2015-08-30 05:30:11,116 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 1
2015-08-30 05:30:11,119 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event JOB_FINISHED
2015-08-30 05:30:11,162 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://namenode:8020/user/root/.staging/job_1440917812423_0018/job_1440917812423_0018_1.jhist to hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018-1440925786804-root-oozie%3Alauncher%3AT%3Dspark%3AW%3Dword%2Dcount%2D001%3AA%3Dspark%2Dhd-1440927011114-1-0-SUCCEEDED-default-1440925795210.jhist_tmp
2015-08-30 05:30:11,187 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018-1440925786804-root-oozie%3Alauncher%3AT%3Dspark%3AW%3Dword%2Dcount%2D001%3AA%3Dspark%2Dhd-1440927011114-1-0-SUCCEEDED-default-1440925795210.jhist_tmp
2015-08-30 05:30:11,189 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://namenode:8020/user/root/.staging/job_1440917812423_0018/job_1440917812423_0018_1_conf.xml to hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018_conf.xml_tmp
2015-08-30 05:30:11,211 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018_conf.xml_tmp
2015-08-30 05:30:11,214 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018.summary_tmp to hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018.summary
2015-08-30 05:30:11,216 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018_conf.xml_tmp to hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018_conf.xml
2015-08-30 05:30:11,217 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018-1440925786804-root-oozie%3Alauncher%3AT%3Dspark%3AW%3Dword%2Dcount%2D001%3AA%3Dspark%2Dhd-1440927011114-1-0-SUCCEEDED-default-1440925795210.jhist_tmp to hdfs://namenode:8020/mr-history/tmp/root/job_1440917812423_0018-1440925786804-root-oozie%3Alauncher%3AT%3Dspark%3AW%3Dword%2Dcount%2D001%3AA%3Dspark%2Dhd-1440927011114-1-0-SUCCEEDED-default-1440925795210.jhist
2015-08-30 05:30:11,217 INFO [Thread-105] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2015-08-30 05:30:11,219 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to
2015-08-30 05:30:11,219 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: History url is http://namenode:19888/jobhistory/job/job_1440917812423_0018
2015-08-30 05:30:11,224 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Waiting for application to be successfully unregistered.
2015-08-30 05:30:12,226 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2015-08-30 05:30:12,228 INFO [Thread-105] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://namenode:8020 /user/root/.staging/job_1440917812423_0018
2015-08-30 05:30:12,231 INFO [Thread-105] org.apache.hadoop.ipc.Server: Stopping server on 60060
2015-08-30 05:30:12,246 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2015-08-30 05:30:12,246 INFO [IPC Server listener on 60060] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 60060
2015-08-30 05:30:12,246 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
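As a quick sanity check on the "about 20 minutes" figure, the elapsed time of the launcher job can be computed from the first and last timestamps in the log above. This is only a minimal sketch: `elapsed_seconds` is a hypothetical helper name, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp as in the excerpt.

```python
from datetime import datetime

# Timestamp layout used by the log lines above, e.g. "2015-08-30 05:09:57,616".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = len("2015-08-30 05:09:57,616")


def parse_timestamp(log_line: str) -> datetime:
    """Parse the leading timestamp of a single log line."""
    return datetime.strptime(log_line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)


def elapsed_seconds(first_line: str, last_line: str) -> float:
    """Return the seconds elapsed between two log lines (hypothetical helper)."""
    return (parse_timestamp(last_line) - parse_timestamp(first_line)).total_seconds()


if __name__ == "__main__":
    first = "2015-08-30 05:09:57,616 INFO [ContainerLauncher #0] ..."
    last = "2015-08-30 05:30:12,246 INFO [IPC Server Responder] ..."
    seconds = elapsed_seconds(first, last)
    print(f"launcher ran for {seconds:.1f} s ({seconds / 60:.1f} min)")
```

For the excerpt above this comes out to a little over 20 minutes, during which the single map task of the launcher only reports progress and nothing else happens.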
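I found the following logs in the Job History Server UI. I don't know why it is trying to connect to the local host (0.0.0.0) on port 8032, since the application is running on a 6-node cluster.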
Error: application failed with exception
java.net.ConnectException: Call From ip-namenode.internal/namenode to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1431)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy17.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:506)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:619)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
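I am not able to get much information from these logs: they do not show any error, yet the job runs and then gets killed without any error log. I am completely clueless about what the reason for the failure could be. Any help would be much appreciated.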
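One observation that may help narrow this down: the `0.0.0.0:8032` in the exception matches Hadoop's built-in defaults for `yarn.resourcemanager.hostname` (`0.0.0.0`) and `yarn.resourcemanager.address` (`${yarn.resourcemanager.hostname}:8032`), which is what a YARN client ends up with when no yarn-site.xml overrides them. A possible explanation, which is an assumption and not something these logs confirm, is that the Spark client started by the launcher does not see the cluster's yarn-site.xml. The sketch below is a simplified model of that fallback, not Hadoop's actual `Configuration` class; `resolve_rm_address` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hadoop's documented defaults for the two relevant properties.
DEFAULTS = {
    "yarn.resourcemanager.hostname": "0.0.0.0",
    "yarn.resourcemanager.address": "${yarn.resourcemanager.hostname}:8032",
}


def resolve_rm_address(yarn_site_xml: Optional[str] = None) -> str:
    """Return the ResourceManager address a client would use (simplified model).

    yarn_site_xml is the text of a yarn-site.xml file, or None when the
    file is not visible to the client.
    """
    props = dict(DEFAULTS)
    if yarn_site_xml:
        root = ET.fromstring(yarn_site_xml)
        for prop in root.findall("property"):
            name = prop.findtext("name")
            value = prop.findtext("value")
            if name and value is not None:
                props[name.strip()] = value.strip()
    # Only the one substitution needed here is modeled.
    return props["yarn.resourcemanager.address"].replace(
        "${yarn.resourcemanager.hostname}",
        props["yarn.resourcemanager.hostname"],
    )


if __name__ == "__main__":
    # No yarn-site.xml visible: falls back to the address seen in the exception.
    print(resolve_rm_address())

    # Hypothetical yarn-site.xml that names the ResourceManager explicitly.
    sample = """
    <configuration>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>job-tracker:8050</value>
      </property>
    </configuration>
    """
    print(resolve_rm_address(sample))
```

If that is what is happening here, it would explain why the command-line spark-submit works (the local Hadoop configuration directory is presumably picked up there) while the same submission from inside the Oozie launcher falls back to `0.0.0.0:8032`.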