Submitting a Flink job in an EMR cluster: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint

Time: 2020-06-17 23:30:25

Tags: apache-flink amazon-emr flink-streaming

I am using EMR 5.30.0 and trying to submit a Flink (1.10.0) job with the following command:

flink run -m yarn-cluster /home/hadoop/flink--test-0.0.1-SNAPSHOT.jar

I get the following error:

Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment. 

After going through the logs on the worker nodes and the job manager, it looks like there is a port conflict:

2020-06-17 21:40:51,199 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Could not start cluster entrypoint YarnJobClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint.
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
        at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
        at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:261)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:215)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
        ... 2 more
Caused by: java.net.BindException: Could not start rest endpoint on any port in port range 8081
        at org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:219)
        at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:165)
        ... 9 more
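
The `BindException` above suggests that something on that node was already listening on port 8081. As a rough diagnostic, one could check this from the node itself. The following is a minimal sketch, not part of Flink or EMR; the helper name `port_in_use` and the default host `127.0.0.1` are assumptions made for illustration.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration only; it is not a Flink or EMR API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8081 is the port named in the stack trace above
    print("8081 in use:", port_in_use(8081))
```

If this reports that the port is in use while another Flink job manager is running on the same node, that would be consistent with the port conflict described here.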

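It looks like a JIRA ticket has already been opened for this (https://issues.apache.org/jira/browse/FLINK-15394), although it is for Flink 1.9. The suggested solution is to set a port range for rest.bind-port in the Flink configuration file.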

But in Flink 1.10, the Flink configuration YAML file only contains the following:

rest.port: 8081

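Another issue I am facing: I submitted multiple Flink jobs (the same job several times) from the AWS console through the Add Step UI. Only one job completed successfully; the rest failed with the error above. When I open the Flink UI, it does not show any jobs at all.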

I am wondering whether each submitted job tries to create its own Flink YARN session instead of using the existing one.

Thanks

1 answer:

Answer 0: (score: 0)

I was able to resolve it. There did appear to be a port conflict; I had to use a range of ports and comment out rest.port: 8081:

#rest.port: 8081
rest.bind-port: 50100-50200
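
To illustrate why a range helps when several job managers end up on the same node, here is a minimal sketch of "bind to the first free port in a range". It is not Flink's actual code; the function name `bind_first_free` and the use of `127.0.0.1` are assumptions for illustration. With a single-port range (as with 8081 alone), a second caller has nowhere left to bind and fails.

```python
import socket
from typing import Tuple


def bind_first_free(start: int, end: int, host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Bind a listening socket to the first free port in [start, end].

    Hypothetical illustration only; this is not how Flink is implemented.
    """
    for port in range(start, end + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock, port
        except OSError:
            # Port already taken (or otherwise unavailable): try the next one
            sock.close()
    raise OSError(f"no free port in range {start}-{end}")


if __name__ == "__main__":
    # Two "endpoints" on the same host get different ports from the range
    first, port_a = bind_first_free(50100, 50200)
    second, port_b = bind_first_free(50100, 50200)
    print(port_a, port_b)
    first.close()
    second.close()
```

With the range 50100-50200 there is room for many concurrent endpoints on one host, which would match the observation that only one of the jobs succeeded while the port was fixed to 8081.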

Thanks