Why am I getting "executor refused connection"?

Posted: 2019-07-13 07:08:49

Tags: apache-spark pyspark

Something strange happens when I execute my code.

When I run the following line:

sourceList = joinLabelrdd_df.select("x").collect()

I get the following error. Note that I have enough memory and CPU.

19/07/14 11:22:34 ERROR TaskSchedulerImpl: Lost executor 5 on 172.16.140.68: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
19/07/14 11:22:34 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190714111835-0000/5 is now EXITED (Command exited with code 137)

This error then triggers a second exception:

19/07/14 11:22:41 WARN TaskSetManager: Lost task 113.1 in stage 9.0 (TID 2154, 172.16.140.113, executor 9): FetchFailed(null, shuffleId=0, mapId=-1, reduceId=113, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
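For context: exit code 137 means the executor process was terminated with SIGKILL, which in practice usually comes from the kernel OOM killer or from the cluster manager killing a container that exceeded its memory limit; the `MetadataFetchFailedException` is then just the downstream symptom of the lost executor's shuffle output. A common first step is to raise the executor/driver memory settings at submit time. This is a hedged sketch, not a diagnosis of this specific job: the `8g`/`2g`/`4g` values and the script name `your_job.py` are illustrative placeholders to be tuned to the actual cluster limits.

```shell
# Raise executor and driver memory so neither the JVM heap nor the
# off-heap overhead trips the container memory limit / OOM killer.
# All values below are placeholders, not recommendations.
spark-submit \
  --conf spark.executor.memory=8g \
  --conf spark.executor.memoryOverhead=2g \
  --conf spark.driver.memory=4g \
  --conf spark.driver.maxResultSize=2g \
  your_job.py
```

Separately, since `collect()` materializes the entire column on the driver, an alternative worth considering is streaming the rows instead, e.g. `joinLabelrdd_df.select("x").toLocalIterator()`, so the full result set never has to fit in driver memory at once.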

0 Answers:

There are no answers.