Spark application fails after MapReduce succeeds

Date: 2019-04-29 17:01:21

Tags: apache-spark hadoop mapreduce oozie hue

I am running a Spark application from a Hue workflow (Oozie) on AWS EMR.

Hadoop EMR 2.7.3
Hive 2.3.0
Hue 3.12.0
Spark 2.2.0

The job takes about 3 hours 30 minutes to run.

The Spark application action itself succeeds.

However, an error occurs when the JobHistoryEventHandler writes the MRAppMaster's job history events directly to DFS in the staging directory and then moves them to the done-dir.

I set the following options when creating the SparkSession, but I still get the same error:

.config("spark.speculation", "false")
.config("spark.hadoop.mapreduce.map.speculative", "false")
.config("spark.hadoop.mapreduce.reduce.speculative", "false")
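For reference, when a job is launched through an Oozie Spark action, settings made in application code can be overridden by the submission-side configuration, so it may be worth pinning the same options there as well. A hedged sketch of the equivalent submission flags (the option names are the standard Spark/Hadoop ones; exactly where they go, e.g. `spark-submit` arguments or the action's `<spark-opts>`, depends on your workflow definition):

```
# Illustrative only: same settings as spark-submit flags / Oozie <spark-opts>
--conf spark.speculation=false \
--conf spark.hadoop.mapreduce.map.speculative=false \
--conf spark.hadoop.mapreduce.reduce.speculative=false
```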

It looks as if the file is deleted before the JobHistory log has been fully written.
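One hedged way to check this hypothesis is to inspect the staging and history directories directly after a failure (the staging path below is taken from the exception in the log; the `getconf` keys are the standard JobHistory configuration keys):

```
# Does the staging copy of the job history (.jhist) file still exist?
hdfs dfs -ls /tmp/hadoop-yarn/staging/admin/.staging/job_1555401240260_3211/

# Where does the JobHistoryEventHandler move finished histories to?
hdfs getconf -confKey mapreduce.jobhistory.intermediate-done-dir
hdfs getconf -confKey mapreduce.jobhistory.done-dir
```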

Can anyone help me?

Please.

2019-04-25 17:56:37,688 [uber-SubtaskRunner] INFO  org.apache.spark.scheduler.cluster.SchedulerExtensionServices  - Stopping SchedulerExtensionServices
(serviceOption=None,
 services=List(),
 started=false)
2019-04-25 17:56:37,689 [uber-SubtaskRunner] INFO  org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend  - Stopped
2019-04-25 17:56:37,696 [dispatcher-event-loop-11] INFO  org.apache.spark.MapOutputTrackerMasterEndpoint  - MapOutputTrackerMasterEndpoint stopped!
2019-04-25 17:56:37,722 [uber-SubtaskRunner] INFO  org.apache.spark.storage.memory.MemoryStore  - MemoryStore cleared
2019-04-25 17:56:37,723 [uber-SubtaskRunner] INFO  org.apache.spark.storage.BlockManager  - BlockManager stopped
2019-04-25 17:56:37,728 [uber-SubtaskRunner] INFO  org.apache.spark.storage.BlockManagerMaster  - BlockManagerMaster stopped
2019-04-25 17:56:37,732 [dispatcher-event-loop-3] INFO  org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint  - OutputCommitCoordinator stopped!
2019-04-25 17:56:37,734 [uber-SubtaskRunner] INFO  org.apache.spark.SparkContext  - Successfully stopped SparkContext

<<< Invocation of Spark command completed <<<

Hadoop Job IDs executed by Spark: job_1555401240260_3212


<<< Invocation of Main class completed <<<

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://ip-11-200-20-303.ap-northeast-2.compute.internal:8020/user/admin/oozie-oozi/0005534-190307151834967-oozie-oozi-W/spark-e891--spark/action-data.seq
2019-04-25 17:56:39,498 [uber-SubtaskRunner] INFO  org.apache.hadoop.io.compress.zlib.ZlibFactory  - Successfully loaded & initialized native-zlib library
2019-04-25 17:56:39,499 [uber-SubtaskRunner] INFO  org.apache.hadoop.io.compress.CodecPool  - Got brand-new compressor [.deflate]
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager@12b1c419 to null

Oozie Launcher ends

2019-04-25 17:56:39,553 [uber-SubtaskRunner] INFO  org.apache.hadoop.mapred.TaskAttemptListenerImpl  - Progress of TaskAttempt attempt_1555401240260_3211_m_000000_0 is : 1.0
2019-04-25 17:56:39,562 [uber-SubtaskRunner] INFO  org.apache.hadoop.mapred.Task  - Task:attempt_1555401240260_3211_m_000000_0 is done. And is in the process of committing
2019-04-25 17:56:39,597 [uber-SubtaskRunner] INFO  org.apache.hadoop.mapred.TaskAttemptListenerImpl  - Progress of TaskAttempt attempt_1555401240260_3211_m_000000_0 is : 1.0
2019-04-25 17:56:39,597 [uber-SubtaskRunner] INFO  org.apache.hadoop.mapred.TaskAttemptListenerImpl  - Done acknowledgement from attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,598 [uber-SubtaskRunner] INFO  org.apache.hadoop.mapred.Task  - Task 'attempt_1555401240260_3211_m_000000_0' done.
2019-04-25 17:56:39,600 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl  - attempt_1555401240260_3211_m_000000_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP
2019-04-25 17:56:39,601 [uber-EventHandler] INFO  org.apache.hadoop.mapred.LocalContainerLauncher  - Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1555401240260_3211_01_000001 taskAttempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,601 [uber-EventHandler] INFO  org.apache.hadoop.mapred.LocalContainerLauncher  - canceling the task attempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,621 [uber-SubtaskRunner] WARN  org.apache.hadoop.mapred.LocalContainerLauncher  - Unable to delete unexpected local file/dir .action.xml.crc: insufficient permissions?
2019-04-25 17:56:39,623 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl  - attempt_1555401240260_3211_m_000000_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2019-04-25 17:56:39,633 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl  - Task succeeded with attempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,635 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl  - task_1555401240260_3211_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2019-04-25 17:56:39,638 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl  - Num completed Tasks: 1
2019-04-25 17:56:39,639 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl  - job_1555401240260_3211Job Transitioned from RUNNING to COMMITTING
2019-04-25 17:56:39,639 [CommitterEvent Processor #1] INFO  org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler  - Processing the event EventType: JOB_COMMIT
2019-04-25 17:56:39,642 [Thread-85] WARN  org.apache.hadoop.hdfs.DFSClient  - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on 
 (inode 1387896): File does not exist. Holder DFSClient_NONMAPREDUCE_-2005393980_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3432)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3233)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3071)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)

    at org.apache.hadoop.ipc.Client.call(Client.java:1475)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
2019-04-25 17:56:39,643 [eventHandlingThread] WARN  org.apache.hadoop.hdfs.DFSClient  - Error while syncing
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /tmp/hadoop-yarn/staging/admin/.staging/job_1555401240260_3211/job_1555401240260_3211_1.jhist (inode 1387896): File does not exist. Holder DFSClient_NONMAPREDUCE_-2005393980_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3432)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3233)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3071)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)

    at org.apache.hadoop.ipc.Client.call(Client.java:1475)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)

2019-04-25 17:56:39,647 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl  - Calling handler for JobFinishedEvent 
2019-04-25 17:56:39,648 [AsyncDispatcher event handler] INFO  org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl  - job_1555401240260_3211Job Transitioned from COMMITTING to SUCCEEDED
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.v2.app.MRAppMaster  - We are finishing cleanly so this is the last retry
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.v2.app.MRAppMaster  - Notify RMCommunicator isAMLastRetry: true
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator  - RMCommunicator notified that shouldUnregistered is: true
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.v2.app.MRAppMaster  - Notify JHEH isAMLastRetry: true
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler  - JobHistoryEventHandler notified that forceJobCompletion is true
2019-04-25 17:56:39,649 [Thread-953] INFO  org.apache.hadoop.mapreduce.v2.app.MRAppMaster  - Calling stop for all the services
2019-04-25 17:56:39,650 [Thread-953] INFO  org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler  - Stopping JobHistoryEventHandler. Size of the outstanding queue size is 2
2019-04-25 17:56:39,650 [Thread-953] INFO  org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler  - In stop, writing event TASK_FINISHED
2019-04-25 17:56:39,653 [Thread-953] ERROR org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler  - Error writing History Event: org.apache.hadoop.mapreduce.jobhistory.TaskFinishedEvent@31ccabfb

0 Answers:

No answers