Spark master keeps launching executors for a driver that no longer exists

Time: 2018-03-09 19:48:59

Tags: apache-spark apache-spark-standalone

The Spark application is deployed in standalone cluster mode with supervision enabled.
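For readers unfamiliar with this setup, here is a minimal sketch of how such a submission is typically assembled. The `--deploy-mode cluster` and `--supervise` flags are standard `spark-submit` options; the helper name `build_submit_command`, the master URL, the class name, and the jar path are placeholders that do not come from the question.

```python
from typing import List


def build_submit_command(master_url: str, main_class: str, app_jar: str) -> List[str]:
    """Build a spark-submit argument list for standalone cluster mode with supervision."""
    return [
        "spark-submit",
        "--master", master_url,      # standalone master URL, e.g. spark://host:7077
        "--deploy-mode", "cluster",  # the driver runs on a worker inside the cluster
        "--supervise",               # ask the cluster to restart the driver if it fails
        "--class", main_class,
        app_jar,
    ]


if __name__ == "__main__":
    # All three values are hypothetical placeholders, not taken from the question.
    cmd = build_submit_command(
        "spark://master-host:7077", "com.example.MyApp", "/path/to/my-app.jar"
    )
    print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run` without shell quoting concerns.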

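During high-availability testing, when a rack hosting a driver instance was powered off (ungracefully), the Spark master did not learn that the driver and its application had been killed, and it kept launching executors for the application for about 15 minutes.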

Master logs (entries like the following are logged for about 15 minutes)

2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Launching executor app-20180309175053-0002/5202 on worker worker-20180309171520-10.247.247.191-51426
2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Removing executor app-20180309175053-0002/5153 because it is EXITED
2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Launching executor app-20180309175053-0002/5203 on worker worker-20180309171632-10.247.247.156-57784
2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Removing executor app-20180309175053-0002/5155 because it is EXITED
2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Launching executor app-20180309175053-0002/5204 on worker worker-20180309123802-10.247.247.121-45652
2018-03-09 18:09:02 INFO org.apache.spark.internal.Logging$class:54 - Removing executor app-20180309175053-0002/5157 because it is EXITED

After the 15 minutes:

2018-03-09 18:09:16 WARN org.apache.spark.internal.Logging$class:66 - Got status update for unknown executor app-20180309175053-0002/5282
2018-03-09 18:09:16 WARN org.apache.spark.internal.Logging$class:66 - Got status update for unknown executor app-20180309175053-0002/5295
2018-03-09 18:09:16 WARN org.apache.spark.internal.Logging$class:66 - Got status update for unknown executor app-20180309175053-0002/5296
2018-03-09 18:09:16 WARN org.apache.spark.internal.Logging$class:66 - Got status update for unknown executor app-20180309175053-0002/5289
2018-03-09 18:09:16 WARN org.apache.spark.internal.Logging$class:66 - Got status update for unknown executor app-20180309175053-0002/5277
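To measure how much executor churn the master logs above represent, a small script can count the launch, removal, and unknown-executor messages per application ID. This is only a sketch that assumes the exact log layout quoted in this question; the names `EVENT_RE` and `summarize_master_log` are made up for illustration and are not part of Spark.

```python
import re
from collections import Counter
from typing import Dict

# Matches the three master log messages quoted in this question.
# Assumes the layout "<date> <time> <LEVEL> <logger> - <message>".
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?:INFO|WARN) \S+ - "
    r"(?P<action>Launching executor|Removing executor|"
    r"Got status update for unknown executor) "
    r"(?P<app>app-\d+-\d+)/(?P<exec>\d+)"
)


def summarize_master_log(text: str) -> Dict[str, Counter]:
    """Count executor events per application ID in master log text.

    Uses finditer over the whole text, so it works whether entries are
    one per line or run together on a single line.
    """
    summary: Dict[str, Counter] = {}
    for match in EVENT_RE.finditer(text):
        counts = summary.setdefault(match.group("app"), Counter())
        counts[match.group("action")] += 1
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < master.log
    for app_id, counts in summarize_master_log(sys.stdin.read()).items():
        print(app_id, dict(counts))
```

A long, steady run of paired launch and removal events for a single application ID is the pattern described in this question.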

Executor logs

2018-03-09 18:50:17 INFO org.apache.spark.internal.Logging$class:54 - Asked to kill executor app-20180309180931-0004/50
2018-03-09 18:50:17 INFO org.apache.spark.internal.Logging$class:54 - Runner thread for executor app-20180309180931-0004/50 interrupted
2018-03-09 18:50:17 INFO org.apache.spark.internal.Logging$class:54 - Killing process!
2018-03-09 18:50:17 INFO org.apache.spark.internal.Logging$class:54 - Executor app-20180309180931-0004/50 finished with state KILLED exitStatus 143

I looked through the Spark master code but could not find much there.

Any help is appreciated, thanks.

0 answers:

No answers