I'm running a Spark application in a Hue Workflow (Oozie) on AWS EMR.
Hadoop EMR 2.7.3
Hive 2.3.0
Hue 3.12.0
Spark 2.2.0
The job takes about 3 hours 30 minutes to run.
The Spark application action itself completes successfully.
However, an error occurs when the MRAppMaster's JobHistoryEventHandler writes the job history events to HDFS in the staging directory, before moving them to the done-dir.
I set the following options when creating the SparkSession, but I still get the same error:
.config("spark.speculation", "false")
.config("spark.hadoop.mapreduce.map.speculative", "false")
.config("spark.hadoop.mapreduce.reduce.speculative", "false")
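For context, the options above are applied on the builder roughly like this (a minimal sketch; the app name is a placeholder, not my actual code):

```scala
import org.apache.spark.sql.SparkSession

// Placeholder app name; the three speculation-related configs are the ones quoted above.
val spark = SparkSession.builder()
  .appName("MyApp")
  .config("spark.speculation", "false")
  .config("spark.hadoop.mapreduce.map.speculative", "false")
  .config("spark.hadoop.mapreduce.reduce.speculative", "false")
  .getOrCreate()
```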
It looks like the staging file is deleted before the JobHistory log is fully written.
Can anyone help me with this? Please.
2019-04-25 17:56:37,688 [uber-SubtaskRunner] INFO org.apache.spark.scheduler.cluster.SchedulerExtensionServices - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
2019-04-25 17:56:37,689 [uber-SubtaskRunner] INFO org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend - Stopped
2019-04-25 17:56:37,696 [dispatcher-event-loop-11] INFO org.apache.spark.MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
2019-04-25 17:56:37,722 [uber-SubtaskRunner] INFO org.apache.spark.storage.memory.MemoryStore - MemoryStore cleared
2019-04-25 17:56:37,723 [uber-SubtaskRunner] INFO org.apache.spark.storage.BlockManager - BlockManager stopped
2019-04-25 17:56:37,728 [uber-SubtaskRunner] INFO org.apache.spark.storage.BlockManagerMaster - BlockManagerMaster stopped
2019-04-25 17:56:37,732 [dispatcher-event-loop-3] INFO org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
2019-04-25 17:56:37,734 [uber-SubtaskRunner] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
<<< Invocation of Spark command completed <<<
Hadoop Job IDs executed by Spark: job_1555401240260_3212
<<< Invocation of Main class completed <<<
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://ip-11-200-20-303.ap-northeast-2.compute.internal:8020/user/admin/oozie-oozi/0005534-190307151834967-oozie-oozi-W/spark-e891--spark/action-data.seq
2019-04-25 17:56:39,498 [uber-SubtaskRunner] INFO org.apache.hadoop.io.compress.zlib.ZlibFactory - Successfully loaded & initialized native-zlib library
2019-04-25 17:56:39,499 [uber-SubtaskRunner] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate]
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager@12b1c419 to null
Oozie Launcher ends
2019-04-25 17:56:39,553 [uber-SubtaskRunner] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Progress of TaskAttempt attempt_1555401240260_3211_m_000000_0 is : 1.0
2019-04-25 17:56:39,562 [uber-SubtaskRunner] INFO org.apache.hadoop.mapred.Task - Task:attempt_1555401240260_3211_m_000000_0 is done. And is in the process of committing
2019-04-25 17:56:39,597 [uber-SubtaskRunner] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Progress of TaskAttempt attempt_1555401240260_3211_m_000000_0 is : 1.0
2019-04-25 17:56:39,597 [uber-SubtaskRunner] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Done acknowledgement from attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,598 [uber-SubtaskRunner] INFO org.apache.hadoop.mapred.Task - Task 'attempt_1555401240260_3211_m_000000_0' done.
2019-04-25 17:56:39,600 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl - attempt_1555401240260_3211_m_000000_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP
2019-04-25 17:56:39,601 [uber-EventHandler] INFO org.apache.hadoop.mapred.LocalContainerLauncher - Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1555401240260_3211_01_000001 taskAttempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,601 [uber-EventHandler] INFO org.apache.hadoop.mapred.LocalContainerLauncher - canceling the task attempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,621 [uber-SubtaskRunner] WARN org.apache.hadoop.mapred.LocalContainerLauncher - Unable to delete unexpected local file/dir .action.xml.crc: insufficient permissions?
2019-04-25 17:56:39,623 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl - attempt_1555401240260_3211_m_000000_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2019-04-25 17:56:39,633 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl - Task succeeded with attempt attempt_1555401240260_3211_m_000000_0
2019-04-25 17:56:39,635 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl - task_1555401240260_3211_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2019-04-25 17:56:39,638 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl - Num completed Tasks: 1
2019-04-25 17:56:39,639 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl - job_1555401240260_3211Job Transitioned from RUNNING to COMMITTING
2019-04-25 17:56:39,639 [CommitterEvent Processor #1] INFO org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler - Processing the event EventType: JOB_COMMIT
2019-04-25 17:56:39,642 [Thread-85] WARN org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on
(inode 1387896): File does not exist. Holder DFSClient_NONMAPREDUCE_-2005393980_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3432)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3233)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3071)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
2019-04-25 17:56:39,643 [eventHandlingThread] WARN org.apache.hadoop.hdfs.DFSClient - Error while syncing
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /tmp/hadoop-yarn/staging/admin/.staging/job_1555401240260_3211/job_1555401240260_3211_1.jhist (inode 1387896): File does not exist. Holder DFSClient_NONMAPREDUCE_-2005393980_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3432)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3233)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3071)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
2019-04-25 17:56:39,647 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl - Calling handler for JobFinishedEvent
2019-04-25 17:56:39,648 [AsyncDispatcher event handler] INFO org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl - job_1555401240260_3211Job Transitioned from COMMITTING to SUCCEEDED
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.v2.app.MRAppMaster - We are finishing cleanly so this is the last retry
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.v2.app.MRAppMaster - Notify RMCommunicator isAMLastRetry: true
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator - RMCommunicator notified that shouldUnregistered is: true
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.v2.app.MRAppMaster - Notify JHEH isAMLastRetry: true
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler - JobHistoryEventHandler notified that forceJobCompletion is true
2019-04-25 17:56:39,649 [Thread-953] INFO org.apache.hadoop.mapreduce.v2.app.MRAppMaster - Calling stop for all the services
2019-04-25 17:56:39,650 [Thread-953] INFO org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler - Stopping JobHistoryEventHandler. Size of the outstanding queue size is 2
2019-04-25 17:56:39,650 [Thread-953] INFO org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler - In stop, writing event TASK_FINISHED
2019-04-25 17:56:39,653 [Thread-953] ERROR org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler - Error writing History Event: org.apache.hadoop.mapreduce.jobhistory.TaskFinishedEvent@31ccabfb