Apache Spark driver hangs when a DataFrame broadcast fails with OutOfMemoryError

Asked: 2018-12-25 09:53:43

Tags: multithreading scala apache-spark apache-spark-sql

I am trying to broadcast a DataFrame that turns out to be larger than spark.sql.autoBroadcastJoinThreshold, and the driver logged:

Exception in thread "broadcast-exchange-0" java.lang.OutOfMemoryError Not enough memory to build and broadcast the table to all worker nodes. As a workaround, you can...

However, instead of failing back to the driver thread, the application simply hangs, with the driver stuck at:

sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:136)
org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:367)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:144)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:140)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
...
...
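For context, the situation above can be sketched as follows. This is a minimal, hypothetical reproduction, not the original job: the table paths, column name, and sizes are assumptions; the key point is that BroadcastExchangeExec materializes the broadcast side on the driver, where the OutOfMemoryError is thrown on the "broadcast-exchange" thread while the main driver thread blocks in ThreadUtils.awaitResult.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

val spark = SparkSession.builder().appName("broadcast-oom-sketch").getOrCreate()

// Hypothetical inputs; assume `facts` is far larger than it looks to the planner.
val facts = spark.read.parquet("/data/facts")
val dims  = spark.read.parquet("/data/dims")

// Either an explicit broadcast hint, or the planner choosing a broadcast
// join because the estimated size is under spark.sql.autoBroadcastJoinThreshold,
// causes the driver to collect and broadcast `facts` to all executors.
// If that collection runs out of memory, the OOM is raised on the
// "broadcast-exchange-0" thread, while this call site waits in awaitResult.
val joined = dims.join(broadcast(facts), Seq("id"))
joined.count()
```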
Because of other, historical issues we have run into, spark.sql.broadcastTimeout is set very high, and indeed the driver did eventually fail once that timeout elapsed. Still, I would like to know whether this is the expected behavior. I tried to trace through ThreadUtils.awaitResult, but could not find (explicit) evidence that this is intended.
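For reference, a sketch of the two configuration knobs involved; the values below are illustrative assumptions, not what the original application used:

```scala
// spark.sql.broadcastTimeout (seconds): how long the driver waits in
// ThreadUtils.awaitResult for the broadcast to complete. Set very high
// here, so the hang persists until this elapses even after the OOM.
spark.conf.set("spark.sql.broadcastTimeout", "36000")

// Setting the auto-broadcast threshold to -1 disables planner-chosen
// broadcast joins entirely, avoiding the driver-side materialization
// (at the cost of falling back to a shuffle-based join).
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
```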

Can anyone confirm that this is not a bug?

0 Answers:

There are no answers yet.