How can I increase the log output of a spark-submit job on bluemix?

Time: 2016-05-12 19:20:07

Tags: python apache-spark ibm-cloud

I have submitted a Python job to bluemix Spark as a Service and it has failed. Unfortunately, the logging is too sparse to tell me why it failed.

How can I increase the log level of the output?

Output from Spark as a Service:

==== Failed Status output =====================================================

Getting status
HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Thu, 12 May 2016 19:09:30 GMT
Content-Type: application/json;charset=utf-8
Content-Length: 850
Connection: keep-alive

{
  "action" : "SubmissionStatusResponse",
  "driverState" : "ERROR",
  "message" : "Exception from the cluster:
org.apache.spark.SparkUserAppException: User application exited with 255
    org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:88)
    org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
    java.lang.reflect.Method.invoke(Method.java:507)
    org.apache.spark.deploy.ego.EGOClusterDriverWrapper$$anon$3.run(EGOClusterDriverWrapper.scala:430)",
  "serverSparkVersion" : "1.6.0",
  "submissionId" : "xxxxxx",
  "success" : true
}
===============================================================================

I have successfully run the same job against a BigInsights cluster, and when running there I also get much more detailed output.

1 Answer:

Answer 0 (score: 2):

The stderr-%timestamp% and stdout-%timestamp% files are downloaded from the cluster into the local directory from which you ran spark-submit.sh. You will usually find the cause of the job failure in those two files.

Reference: http://spark.apache.org/docs/latest/spark-standalone.html#monitoring-and-logging
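
As a rough sketch of how to get more detail into those files (this goes beyond the answer above: it assumes a PySpark driver script, the application and logger names are placeholders, and SparkContext.setLogLevel is available from Spark 1.4 onward), you can raise verbosity inside the driver itself so the traceback and debug messages land in stderr-%timestamp%:

# Hedged sketch: placeholder names, not the original failing application.
import logging

from pyspark import SparkContext

# Python-level log records go to stderr by default, which is what gets
# downloaded as stderr-%timestamp% after the job finishes.
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("my_job")  # placeholder logger name

sc = SparkContext(appName="my_job")  # placeholder application name
sc.setLogLevel("DEBUG")  # raise Spark's own log4j level (Spark 1.4+)

try:
    rdd = sc.parallelize(range(10))
    log.debug("count = %d", rdd.count())
except Exception:
    # Log the full traceback before the driver exits with a non-zero status
    # (the 255 seen in the SubmissionStatusResponse above).
    log.exception("job failed")
    raise

With the handler at DEBUG level and the exception logged before the process exits, the downloaded stderr-%timestamp% file should show what actually made the application exit with 255.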