EMR 5.0 + Spark application gets stuck in an infinite loop

Date: 2016-08-30 15:57:42

Tags: apache-spark spark-streaming emr amazon-emr

I'm trying to deploy a Spark 2.0 Streaming application via Amazon EMR 5.0. The application seems to get stuck in an infinite loop, logging "INFO Client: Application report for application_14111979683_1111 (state: ACCEPTED)" over and over, before eventually exiting.

Here is how I'm trying to submit it from the command line:


aws emr add-steps --cluster-id --steps Type=Spark,Name="Spark Program",ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--class,,s3://.jar]

Any ideas?

Thanks, 叶兰

16/08/30 15:43:27 INFO SecurityManager: Changing view acls to: hadoop
16/08/30 15:43:27 INFO SecurityManager: Changing modify acls to: hadoop
16/08/30 15:43:27 INFO SecurityManager: Changing view acls groups to: 
16/08/30 15:43:27 INFO SecurityManager: Changing modify acls groups to: 
16/08/30 15:43:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hadoop); groups with view permissions: Set(); users  with modify permissions: Set(hadoop); groups with modify permissions: Set()
16/08/30 15:43:27 INFO Client: Submitting application application_14111979683_1111 to ResourceManager
16/08/30 15:43:27 INFO YarnClientImpl: Submitted application application_14111979683_1111
16/08/30 15:43:28 INFO Client: Application report for application_14111979683_1111 (state: ACCEPTED)
16/08/30 15:43:28 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1472571807467
     final status: UNDEFINED
     tracking URL: http://xxxxxx:20888/proxy/application_14111979683_1111/
     user: hadoop
16/08/30 15:43:29 INFO Client: Application report for application_14111979683_1111 (state: ACCEPTED)

This eventually raised the following exception:

16/08/31 08:14:48 INFO Client: 
     client token: N/A
     diagnostics: Application application_1472630652740_0001 failed 2 times due to AM Container for appattempt_1472630652740_0001_000002 exited with  exitCode: 13
For more detailed output, check application tracking page:http://ip-10-0-0-8.eu-west-1.compute.internal:8088/cluster/app/application_1472630652740_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1472630652740_0001_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

1 Answer:

Answer 0 (score: 0)

EMR is effectively a wrapper around YARN, so we need to add "--master yarn" to the arguments of the deploy command line. Exit code 13 from the ApplicationMaster typically indicates that the master configured inside the application conflicts with cluster deploy mode. Example:

aws emr add-steps --cluster-id j-XXXXXXXXX --steps Type=Spark,Name="Spark Program",ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--class,com.xxx.MyMainClass,s3://]

The other thing needed is to remove `sparkConf.setMaster("local[*]")` from the initialization of the Spark conf: properties set directly on a SparkConf in code take precedence over spark-submit flags, so a hard-coded local master would override the `--master yarn` passed on the command line.
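Putting the fixes together, here is a sketch of the full submission command. The cluster id, main class, and jar path are hypothetical placeholders (the original post truncates them), and the command is echoed rather than executed so the final argument string can be inspected before actually submitting:

```shell
# Placeholders: replace with your real cluster id, main class, and jar location.
CLUSTER_ID="j-XXXXXXXXX"
MAIN_CLASS="com.xxx.MyMainClass"
APP_JAR="s3://my-bucket/my-app.jar"   # hypothetical jar path

# --master yarn and --deploy-mode cluster go inside Args=[...];
# in this step syntax the spark-submit options are comma-separated, not space-separated.
STEP_ARGS="Type=Spark,Name=\"Spark Program\",ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--class,${MAIN_CLASS},${APP_JAR}]"

# Print the final command for inspection; drop the 'echo' to actually submit.
echo "aws emr add-steps --cluster-id ${CLUSTER_ID} --steps ${STEP_ARGS}"
```

Remember that the jar itself must not call `setMaster` in code, or the `--master,yarn` in the step arguments will be overridden.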