Spark application fails with "Filesystem closed" exception

Asked: 2017-08-21 05:26:19

Tags: apache-spark

My Spark application is failing with a "Filesystem closed" exception; a typical stack trace is attached at the end. I did some research, which suggests the executor had already shut down (see this post).

This can reportedly happen after a large shuffle or an out-of-memory exception, but I cannot find either in the YARN logs.

My questions are:

  1. If these exceptions did occur, where would I find them if they are not in the YARN logs?
  2. If these exceptions never occurred, what else should I investigate besides the YARN logs?

Thanks!

    Here is a YARN log snippet:

    2017-08-21 01:55:10,668 ERROR  org.apache.spark.scheduler.LiveListenerBus: Listener EventLoggingListener threw an exception
    java.io.IOException: Filesystem closed
            at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:837)
            at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2170)
            at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2116)
            at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
            at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at com.pepperdata.common.reflect.b.b(SourceFile:149)
            at com.pepperdata.common.reflect.b.c(SourceFile:205)
            at com.pepperdata.supervisor.agent.resource.T.b(SourceFile:102)
            at com.pepperdata.supervisor.agent.resource.I.hflush(SourceFile:61)
            at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:140)
            at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:140)
            at scala.Option.foreach(Option.scala:236)
            at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:140)
            at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:163)
            at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:37)
            at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
            at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
            at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
            at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
            at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
            at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
            at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
            at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
            at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
            at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
    

1 Answer:

Answer 0 (score: 1)

I just found that the cause of the problem was that the same FileSystem instance was unexpectedly close()d multiple times. Removing the close() calls made the exception go away.
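
The reason closing "your own" handle can break Spark is that Hadoop's `FileSystem.get(conf)` returns a JVM-wide cached instance per (URI, user), so every caller shares the same object; calling `close()` on it also closes it for Spark's internals, such as the `EventLoggingListener` seen in the stack trace. The sketch below simulates that cache with stand-in classes (`FakeFileSystem`, `FakeCache` are illustrative, not Hadoop APIs) to show why a later `hflush()` throws `IOException: Filesystem closed`:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stand-in for a FileSystem handle that refuses I/O after close().
class FakeFileSystem {
    private boolean open = true;
    void close() { open = false; }
    void hflush() throws IOException {
        if (!open) throw new IOException("Filesystem closed");
    }
}

// Stand-in for Hadoop's FileSystem cache: same key -> same shared instance.
class FakeCache {
    private static final Map<String, FakeFileSystem> CACHE = new HashMap<>();
    static FakeFileSystem get(String uri) {
        return CACHE.computeIfAbsent(uri, k -> new FakeFileSystem());
    }
}

public class FsCacheDemo {
    public static void main(String[] args) {
        FakeFileSystem mine = FakeCache.get("hdfs://cluster");
        FakeFileSystem sparks = FakeCache.get("hdfs://cluster"); // same object
        mine.close(); // user code closes "its" handle...
        try {
            sparks.hflush(); // ...so the shared instance now fails for Spark too
        } catch (IOException e) {
            System.out.println("java.io.IOException: " + e.getMessage());
        }
    }
}
```

If you genuinely need a handle you can close independently, Hadoop also offers `FileSystem.newInstance(conf)`, which bypasses the cache; simply never closing the cached instance (as in the answer) is the usual fix, since Hadoop closes cached filesystems on JVM shutdown.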