Spark in YARN cluster mode - YARN client reports FAILED even when the job completes successfully

Time: 2018-11-29 17:31:50

Tags: apache-spark yarn

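I am trying to run Spark (v2.3.0) in YARN cluster mode. We have traditionally run in YARN client mode, but some jobs are submitted from .NET web services, so in client mode we have to keep a host process running in the background (HostingEnvironment.QueueBackgroundWorkTime...). We are hoping to execute these jobs in a more "fire and forget" style.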

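Our jobs continue to run successfully, but we see a strange entry in the logs: the YARN client that submits the job to the application manager always reports that it failed: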

18/11/29 16:54:35 INFO yarn.Client: Application report for application_1539978346138_110818 (state: RUNNING)
18/11/29 16:54:36 INFO yarn.Client: Application report for application_1539978346138_110818 (state: RUNNING)
18/11/29 16:54:37 INFO yarn.Client: Application report for application_1539978346138_110818 (state: FINISHED)
18/11/29 16:54:37 INFO yarn.Client: 
     client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
     diagnostics: N/A
     ApplicationMaster host: <ip address>
     ApplicationMaster RPC port: 0
     queue: root.default
     start time: 1543510402372
     final status: FAILED
     tracking URL: http://server.host.com:8088/proxy/application_1539978346138_110818/
     user: p800s1
Exception in thread "main" org.apache.spark.SparkException: Application application_1539978346138_110818 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1153)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1568)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:892)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/11/29 16:54:37 INFO util.ShutdownHookManager: Shutdown hook called

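We always create a SparkSession and always end with sys.exit(0) (although the Spark framework appears to ignore that exit code no matter how we submit the job). We also have our own internal error logging, which routes to Kafka / ElasticSearch. No errors were reported during the job run.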

Here is an example of the submit command: spark2-submit --keytab /etc/keytabs/p800s1.ktf --principal p800s1@OURDOMAIN.COM --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 4g --class com.path.to.MainClass /path/to/UberJar.jar arg1 arg2
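
For the "fire and forget" goal, one documented setting that may be worth trying is spark.yarn.submit.waitAppCompletion. When it is set to false in YARN cluster mode, the client process exits after submission instead of staying alive to report the application's status. The sketch below reuses the arguments from the submit command above and only adds that one --conf; the variable names WAIT_CONF and SUBMIT_CMD are illustrative and not part of the original setup. It does not by itself explain the FAILED final status, and with this setting the final status would have to be checked elsewhere (for example in the ResourceManager UI).

```shell
#!/bin/sh
# Sketch only: same arguments as the submit command above, plus one extra --conf.
# spark.yarn.submit.waitAppCompletion=false asks the YARN client to exit once
# the application has been submitted instead of polling until it finishes.
WAIT_CONF="spark.yarn.submit.waitAppCompletion=false"

# Assemble the command as a single line (backslash-newline inside double
# quotes is a line continuation, so no newlines end up in the string).
SUBMIT_CMD="spark2-submit \
--keytab /etc/keytabs/p800s1.ktf \
--principal p800s1@OURDOMAIN.COM \
--master yarn \
--deploy-mode cluster \
--conf $WAIT_CONF \
--driver-memory 2g \
--executor-memory 4g \
--class com.path.to.MainClass \
/path/to/UberJar.jar arg1 arg2"

# Print the command for review; to actually submit, run: eval "$SUBMIT_CMD"
echo "$SUBMIT_CMD"
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without spark2-submit installed.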

This appears to be harmless noise, but I don't like noise I don't understand. Has anyone experienced something similar?

0 answers:

No answers