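I am running a Spark job on EMR via YARN. The job fails, but I cannot find where EMR logs the exception. All I can see is the traceback in the console output on the master node, shown below. There should be a more detailed log file that shows what caused the exception, but I cannot locate it. I looked in hdfs://var/log/spark/app/application_xxx and it does not show any errors. Here is how I submit the application: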
spark-submit --deploy-mode cluster --master yarn --num-executors 1 --executor-cores 2 --executor-memory 5g word2vec_app.py hdfs:///test/r8_no_sto.txt
Here is the console output on the master node:
Exception in thread "main" org.apache.spark.SparkException: Application application_1488419676573_0005 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1167)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1213)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Answer 0 (score: 2)
The stdout/stderr of the nodes can be found on each node under /mnt/var/log/hadoop-yarn/containers/application.
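As a rough illustration of how one might search those container logs, here is a minimal Python sketch. It assumes that the per-application directories under the containers path contain the application ID in their name, that the files are plain text named stdout and stderr (not yet compressed), and that the real failure is marked by a Python traceback or a line containing "Exception" or "ERROR". The function name find_error_lines and the pattern are hypothetical, chosen only for this example.

```python
import os
import re

# Assumption: these markers usually point at the real failure in a container log.
ERROR_PATTERN = re.compile(r"Traceback \(most recent call last\)|Exception|ERROR")


def find_error_lines(log_root, app_id):
    """Return (path, line_number, text) for suspicious lines in one app's container logs."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        # Skip directories that do not belong to this application.
        if app_id not in dirpath:
            continue
        for name in sorted(filenames):
            # Assumption: container output is stored in files named stdout/stderr.
            if name not in ("stdout", "stderr"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if ERROR_PATTERN.search(line):
                        hits.append((path, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Run on a node; the application ID is the one from the question's output.
    for path, number, text in find_error_lines(
        "/mnt/var/log/hadoop-yarn/containers",
        "application_1488419676573_0005",
    ):
        print(f"{path}:{number}: {text}")
```

Because the job was submitted with --deploy-mode cluster, the driver itself runs inside a YARN container, so the Python traceback from word2vec_app.py would be expected in one of these container logs rather than in the spark-submit output on the master node.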
Answer 1 (score: 0)
There will be a folder in your S3. I believe there is an option to set the log directory when configuring EMR. The path is s3://aws-logs-[ACCOUNT_NUMBER]-[AVAILABILITY_ZONE]/elasticmapreduce/