PySpark job keeps failing with exit code 13 on EMR

Asked: 2020-03-04 17:26:01

Tags: amazon-web-services apache-spark amazon-emr

I have a simple Spark script that I want to run on EMR as a step. Here it is:

FileInDLK_ul = "s3://Bucket/something.csv.gz"
df_ul = spark.read.csv(FileInDLK_ul, header=True)

df_ul.repartition(10).write.format("parquet").save("s3://AnotherBucket")

When I test it in Zeppelin, it runs perfectly.
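
One difference between the two environments is worth noting: Zeppelin creates the `spark` session for the notebook, whereas a script submitted through spark-submit has to build its own. Below is a minimal sketch of a standalone version of the script above. It assumes the source and destination paths are passed as command-line arguments, and the names `convert`, `main`, and the app name `csv-to-parquet` are illustrative, not part of the original script.

```python
import sys


def convert(spark, src, dst, partitions=10):
    """Read a CSV file with a header row and write it back out as Parquet."""
    df = spark.read.csv(src, header=True)
    df.repartition(partitions).write.format("parquet").save(dst)


def main(argv):
    # Imported here so the module can be loaded without PySpark installed
    # (for example when exercising convert() with a stub session).
    from pyspark.sql import SparkSession

    src, dst = argv
    # No .master(...) call: the master and deploy mode are left to spark-submit.
    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()
    try:
        convert(spark, src, dst)
    finally:
        spark.stop()


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1:])
    else:
        # Assumed invocation; adjust to however the step passes its arguments.
        print("usage: spark-submit script.py <source csv> <destination>", file=sys.stderr)
```

Keeping the read/write logic in `convert` means the same function can be called from a notebook with the session Zeppelin provides, or from `main` when the script runs as a step.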

When I launch it as an EMR step, it fails immediately with:

20/03/04 17:16:36 INFO Client: Application report for application_1583330635514_0007 (state: ACCEPTED)
20/03/04 17:16:37 INFO Client: Application report for application_1583330635514_0007 (state: ACCEPTED)
20/03/04 17:16:38 INFO Client: Application report for application_1583330635514_0007 (state: FAILED)
20/03/04 17:16:38 INFO Client: 
     client token: N/A
     diagnostics: Application application_1583330635514_0007 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1583330635514_0007_000001 exited with  exitCode: 13
Failing this attempt.Diagnostics: Exception from container-launch.
Container id: container_1583330635514_0007_01_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
    at org.apache.hadoop.util.Shell.run(Shell.java:869)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:235)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
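
The client-side report above only says that the ApplicationMaster container exited with code 13; the underlying exception is normally found in the container's own logs. The following is a small sketch that pulls the application ID and exit code out of such a report and builds the `yarn logs` command for fetching those logs. The helper names are illustrative, and it assumes YARN log aggregation is enabled on the cluster.

```python
import re

APP_ID = re.compile(r"application_\d+_\d+")
EXIT_CODE = re.compile(r"exitCode[:=]\s*(\d+)")


def summarize_report(report):
    """Return (application_id, exit_code) from a YARN client report; None where absent."""
    app = APP_ID.search(report)
    code = EXIT_CODE.search(report)
    return (app.group(0) if app else None, int(code.group(1)) if code else None)


def yarn_logs_command(application_id):
    """Build the command that prints the aggregated container logs for an application."""
    return ["yarn", "logs", "-applicationId", application_id]


# Diagnostics line copied from the report above.
report = (
    "diagnostics: Application application_1583330635514_0007 failed 1 times "
    "(global limit =2; local limit is =1) due to AM Container for "
    "appattempt_1583330635514_0007_000001 exited with  exitCode: 13"
)
app_id, exit_code = summarize_report(report)
print(app_id, exit_code)
print(" ".join(yarn_logs_command(app_id)))
```

Running the printed command on the master node should show the driver's stdout and stderr, which is usually where the actual Python or Spark error appears.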

These are the arguments I use for the step:

[screenshot of the step arguments; image not included]
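
Because the step arguments were posted only as an image, here is a sketch of what a typical Spark step looks like when written out, in the shape accepted by boto3's `add_job_flow_steps`. The script location `s3://Bucket/script.py`, the step name, and the helper `spark_step` are placeholders and not values taken from the screenshot.

```python
def spark_step(script_uri, script_args=(), name="csv-to-parquet"):
    """Build an EMR step definition that runs a PySpark script through spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar is the EMR helper that runs a command on the cluster.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                *script_args,
            ],
        },
    }


step = spark_step(
    "s3://Bucket/script.py",  # hypothetical script location
    ["s3://Bucket/something.csv.gz", "s3://AnotherBucket"],
)
print(step["HadoopJarStep"]["Args"])
```

The resulting dictionary could be passed as `Steps=[step]` to `boto3.client("emr").add_job_flow_steps(...)`. In cluster deploy mode the driver runs inside the YARN ApplicationMaster, so it is worth checking that the script itself does not hard-code a different master such as `local`, which is a commonly reported source of conflicts.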

What am I missing?

0 answers:

No answers yet.