我已连接到Spark群集的主节点,在emr内部运行,并且我正在尝试提交基于python的应用程序:
spark-submit --verbose --deploy-mode cluster --master yarn-cluster --num-executors 3 --executor-cores 6 --executor-memory 1g test.py
该过程会生成一组日志转储,包括以下对集群的部署确认:
6/08/29 20:47:51 INFO Client: Uploading resource file:/home/hadoop/test.py -> hdfs://ip-xxx-xxx-xxx-xxx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1472396426409_0007/test.py
16/08/29 20:47:51 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> hdfs://ip-xxx-xxx-xxx-xxx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1472396426409_0007/pyspark.zip
16/08/29 20:47:51 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/py4j-0.10.1-src.zip -> hdfs://ip-xxx-xxx-xxx-xxx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1472396426409_0007/py4j-0.10.1-src.zip
然而,该应用程序无法运行,报告缺少py4j库? :
6/08/29 20:48:47 INFO Client: Application report for application_1472396426409_0007 (state: ACCEPTED)
16/08/29 20:48:48 INFO Client: Application report for application_1472396426409_0007 (state: FAILED)
16/08/29 20:48:48 INFO Client:
client token: N/A
diagnostics: Application application_1472396426409_0007 failed 2 times due to AM Container for appattempt_1472396426409_0007_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://ip-xxx-xxx-xxx-xxx.ec2.internal:8088/cluster/app/application_1472396426409_0007Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://ip-xxx-xxx-xxx-xxx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1472396426409_0007/py4j-0.10.1-src.zip
java.io.FileNotFoundException: File does not exist: hdfs://ip-xxx-xxx-xxx-xxx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1472396426409_0007/py4j-0.10.1-src.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
我是否滥用了这个命令?
答案 0 :(得分:1)
这似乎是aws系统的一个错误。 Yarn监控系统并注意到已部署的代码不再存在 - 这实际上表明火花已完成处理。
要验证这是否是问题,请通过阅读应用程序的日志进行仔细检查 - 即,针对主节点运行类似的操作:
require
并仔细检查您是否在日志中看到成功消息:
yarn logs -applicationId application_1472396426409_0007