Spark job on cluster fails after running, with no exception thrown

Time: 2015-04-15 21:31:44

Tags: amazon-web-services amazon-ec2 apache-spark yarn emr

I am trying to run a standalone Spark application on YARN on EC2 from the command line. I am submitting it with the following spark-submit command:

./bin/spark-submit   --class PageRankGraphX   --master yarn-cluster  --properties-file spark-defaults.conf.2   --executor-memory 2G   --total-executor-cores 5   ./SparkPageRank-assembly-1.0.jar s3://linkfilefull/full/links_small.txt s3://conansoutputbucket/smalloutput.txt 10 0.15 2

This is the output. No exception or error is reported, yet the job goes from RUNNING to FAILED right after it starts:

15/04/15 21:27:03 INFO yarn.Client: Application report from ASM:
         application identifier: application_1429126831428_0027
         appId: 27
         clientToAMToken: null
         appDiagnostics:
         appMasterHost: ip-172-31-1-67.eu-west-1.compute.internal
         appQueue: default
         appMasterRpcPort: 0
         appStartTime: 1429133214320
         yarnAppState: RUNNING
         distributedFinalState: UNDEFINED
         appTrackingUrl: http://172.31.10.227:9046/proxy/application_1429126831428_0027/
         appUser: hadoop
15/04/15 21:27:04 INFO yarn.Client: Application report from ASM:
         application identifier: application_1429126831428_0027
         appId: 27
         clientToAMToken: null
         appDiagnostics:
         appMasterHost: ip-172-31-1-67.eu-west-1.compute.internal
         appQueue: default
         appMasterRpcPort: 0
         appStartTime: 1429133214320
         yarnAppState: FINISHED
         distributedFinalState: FAILED
         appTrackingUrl: http://172.31.10.227:9046/proxy/application_1429126831428_0027/A
         appUser: hadoop

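Does anyone know what might be causing this, or how I can investigate it? When I try to access the YARN logs, it says that logs are disabled or not ready yet.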

1 Answer:

Answer 0: (score: -1)

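To enable access to Hadoop's web UIs, see Amazon's documentation. Once you are in the UI, you can check the application's stderr output, which is where the exception will most likely appear. As others have mentioned, this log will also be available on S3.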
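In yarn-cluster mode the driver runs inside the ApplicationMaster container, so an exception thrown by the application ends up in that container's stderr rather than in the spark-submit client output shown in the question. If the web UI is not reachable, a command-line route may help. The following is a minimal sketch, assuming the client output was saved to a file (the file name `submit.log` and the script name below are hypothetical) and that the `yarn` CLI is on the PATH of the node where it runs. Note that `yarn logs` only returns output when log aggregation is enabled and has completed, which may not be the case here given the message quoted in the question.

```shell
#!/usr/bin/env bash
# Sketch: pull the YARN application ID out of saved spark-submit output,
# then ask YARN for that application's status and container logs.

# Print the first application ID found in the given file.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' "$1" | head -n 1
}

# Only run when a log file is passed as the first argument.
if [ "$#" -ge 1 ]; then
  app_id=$(extract_app_id "$1")
  echo "Application ID: $app_id"
  if command -v yarn >/dev/null 2>&1; then
    # Final status and diagnostics as recorded by the ResourceManager.
    yarn application -status "$app_id"
    # Container stdout/stderr; needs log aggregation to be enabled and finished.
    yarn logs -applicationId "$app_id"
  fi
fi
```

Saved as, say, `find_logs.sh`, it could be run as `bash find_logs.sh submit.log`. If `yarn logs` still reports that aggregation is disabled or incomplete, the stderr view in the web UI or the copy on S3 mentioned in the answer remain the places to look.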