Question

I developed a Spark application locally and ran into no problems. But when I try to submit it to a YARN cluster running in a Docker image, I get the following message:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 26, sandbox): ExecutorLostFailure (executor 1 lost)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

The command used to launch the application is:

spark-submit --class myapp.myapp_spark.App --master yarn-client /opt/myapp/myapp_spark.jar

My application uses a Mongo database. Is this related to a memory problem, the connection to Mongo, or something else? Thanks in advance.

org.apache.spark.SparkException: Job aborted due to stage failure with Yarn and Docker

0 answers: