I'm trying to run Spark in yarn-cluster mode (v2.3.0). Traditionally we've been running in yarn-client mode, but some jobs are submitted from a .NET web service, so in client mode we have to keep a host process running in the background (HostingEnvironment.QueueBackgroundWorkItem...). We're hoping we can execute these jobs in a more "fire and forget" style.
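For context, a "fire and forget" submission can also be done programmatically through Spark's launcher API (org.apache.spark.launcher.SparkLauncher) rather than shelling out to spark2-submit. A minimal sketch; the jar path, class name, and arguments are placeholders mirroring the submit command shown further down:

import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object FireAndForgetSubmit {
  def main(args: Array[String]): Unit = {
    // Submit in cluster mode: the driver runs on YARN, so this JVM
    // can exit as soon as the application has been handed off.
    val handle: SparkAppHandle = new SparkLauncher()
      .setAppResource("/path/to/UberJar.jar")  // placeholder jar path
      .setMainClass("com.path.to.MainClass")   // placeholder main class
      .setMaster("yarn")
      .setDeployMode("cluster")
      .setConf("spark.driver.memory", "2g")
      .setConf("spark.executor.memory", "4g")
      .addAppArgs("arg1", "arg2")
      .startApplication()

    // Optionally observe state transitions instead of blocking on them.
    handle.addListener(new SparkAppHandle.Listener {
      override def stateChanged(h: SparkAppHandle): Unit =
        println(s"application state: ${h.getState}")
      override def infoChanged(h: SparkAppHandle): Unit = ()
    })
  }
}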
Our jobs continue to run successfully, but we see a curious entry in the logs where the yarn client that submits the job to the application manager always reports failure:
18/11/29 16:54:35 INFO yarn.Client: Application report for application_1539978346138_110818 (state: RUNNING)
18/11/29 16:54:36 INFO yarn.Client: Application report for application_1539978346138_110818 (state: RUNNING)
18/11/29 16:54:37 INFO yarn.Client: Application report for application_1539978346138_110818 (state: FINISHED)
18/11/29 16:54:37 INFO yarn.Client:
         client token: Token { kind: YARN_CLIENT_TOKEN, service: }
         diagnostics: N/A
         ApplicationMaster host: <ip address>
         ApplicationMaster RPC port: 0
         queue: root.default
         start time: 1543510402372
         final status: FAILED
         tracking URL: http://server.host.com:8088/proxy/application_1539978346138_110818/
         user: p800s1
Exception in thread "main" org.apache.spark.SparkException: Application application_1539978346138_110818 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1153)
        at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1568)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:892)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/11/29 16:54:37 INFO util.ShutdownHookManager: Shutdown hook called
We always create a SparkSession and always return sys.exit(0) (although that appears to be ignored by the Spark framework regardless of how we submit a job). We also have our own internal error logging that routes to Kafka/ElasticSearch. No errors are reported during the job run.
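For reference, the driver entry point follows roughly this shape (a sketch with placeholder names; the job body is elided, and I'm assuming the session is stopped before exiting):

import org.apache.spark.sql.SparkSession

object MainClass {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ExampleJob")  // placeholder app name
      .getOrCreate()
    try {
      // ... job logic goes here; errors are routed to Kafka/ElasticSearch ...
    } finally {
      spark.stop()            // stop the context before exiting
    }
    sys.exit(0)               // we always return sys.exit(0), as noted above
  }
}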
Here's an example of the submit command:

spark2-submit --keytab /etc/keytabs/p800s1.ktf --principal p800s1@OURDOMAIN.COM --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 4g --class com.path.to.MainClass /path/to/UberJar.jar arg1 arg2
This appears to be harmless noise, but I don't like noise that I don't understand. Has anyone experienced something similar?