Databricks notebook crashes on a memory-intensive job

Asked: 2020-06-09 04:45:47

Tags: azure pyspark databricks azure-databricks

I am running a handful of operations to aggregate a large amount of data (roughly 600 GB) on Azure Databricks. I recently noticed that the notebook crashes and Databricks returns the error below. The same code used to run fine on a smaller 6-node cluster; after scaling up to 12 nodes I started hitting this, so I suspect it is a configuration issue.

Please help. I am using the default Spark configuration, with the partition count at 200 and 88 executors across the nodes.
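For reference, here is a minimal sketch of how the partition setting can be inspected and changed in PySpark; the table, columns, and output path below are placeholders, not my actual job:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # On Databricks `spark` already exists; getOrCreate() simply returns it
    spark = SparkSession.builder.getOrCreate()

    # Check the current shuffle partition count (the default is 200)
    print(spark.conf.get("spark.sql.shuffle.partitions"))

    # Raise it so wide aggregations fan out over more tasks on the bigger cluster
    spark.conf.set("spark.sql.shuffle.partitions", "800")  # example value, tune for your cluster

    # Placeholder aggregation: "events", its columns, and the output path are illustrative only
    df = spark.table("events")
    agg = df.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))
    agg.write.mode("overwrite").format("delta").save("/mnt/output/agg")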


Thanks
Internal error, sorry. Attach your notebook to a different cluster or restart the current cluster.
java.lang.RuntimeException: abort: DriverClient destroyed
    at com.databricks.backend.daemon.driver.DriverClient.$anonfun$poll$3(DriverClient.scala:381)
    at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:307)
    at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
    at com.databricks.threading.NamedExecutor$$anon$2.$anonfun$run$1(NamedExecutor.scala:335)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
    at com.databricks.threading.NamedExecutor.withAttributionContext(NamedExecutor.scala:265)
    at com.databricks.threading.NamedExecutor$$anon$2.run(NamedExecutor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

1 Answer:

Answer 0 (score: 1)

I am not sure what the cost implications would be, but how about enabling the autoscaling option on the cluster and increasing Max Workers? You could also try changing the worker type to one with more resources.
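As a rough sketch, autoscaling can also be configured when creating a cluster through the Databricks Clusters REST API (assuming API version 2.0; the workspace URL, token, runtime version, node type, and worker counts below are placeholders to adjust):

    import requests

    # Placeholder values: substitute your workspace URL and a personal access token
    DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"
    TOKEN = "<personal-access-token>"

    cluster_spec = {
        "cluster_name": "agg-cluster-autoscale",
        "spark_version": "6.4.x-scala2.11",                   # pick a runtime available in your workspace
        "node_type_id": "Standard_DS4_v2",                    # a bigger worker type for better resources
        "autoscale": {"min_workers": 6, "max_workers": 12},   # autoscaling instead of a fixed size
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=cluster_spec,
    )
    resp.raise_for_status()
    print(resp.json())  # returns the new cluster_id

The same options (Enable autoscaling, Min Workers, Max Workers, worker type) are also available on the cluster's Edit page in the UI.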
