Kerberos authentication error when running the Flink example job on a YARN cluster (Cloudera)

Asked: 2015-08-19 03:22:06

Tags: apache-flink

I am trying to run the Flink example job (flink examples WordCount.jar) on a YARN cluster, but I get the following security authentication error.

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

I am not sure where the problem is or what I am missing. I can run Spark or MapReduce jobs on the same Cloudera Hadoop cluster without any issues.

I did update the configuration file paths for hdfs-site.xml and core-site.xml in flink-conf.yaml (updated identically on the master and worker nodes), and I also exported HADOOP_CONF_DIR. In addition, I tried specifying host:port in the HDFS file path when executing the flink run command.
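For reference, the relevant entries in flink-conf.yaml looked roughly like this (a hedged sketch using the Flink 0.9-era configuration keys; the exact paths are assumptions for a Cloudera setup):

```yaml
# flink-conf.yaml -- point Flink at the Hadoop client configuration
# (key names as in Flink 0.9; paths are illustrative)
fs.hdfs.hadoopconf: /etc/hadoop/conf
fs.hdfs.hdfssite: /etc/hadoop/conf/hdfs-site.xml
```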

Error message

    22:14:25,138 ERROR   org.apache.flink.client.CliFrontend                           - Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at org.apache.flink.client.program.Client.run(Client.java:413)
        at org.apache.flink.client.program.Client.run(Client.java:356)
        at org.apache.flink.client.program.Client.run(Client.java:349)
        at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63)
        at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:78)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
        at org.apache.flink.client.program.Client.run(Client.java:315)
        at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:584)
        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:290)
        at org.apache.flink.client.CliFrontend$2.run(CliFrontend.java:873)
        at org.apache.flink.client.CliFrontend$2.run(CliFrontend.java:870)
        at org.apache.flink.runtime.security.SecurityUtils$1.run(SecurityUtils.java:50)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.flink.runtime.security.SecurityUtils.runSecured(SecurityUtils.java:47)
        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:870)
        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:922)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

2 answers:

Answer 0 (score: 0)

The problem you are running into is probably not related to exporting HADOOP_CONF_DIR to make the Hadoop configuration files visible to Flink, but to the value of HADOOP_CONF_DIR itself! If you are using Cloudera Manager, make sure the location you are pointing to is correct and actually exists on all nodes. Also, it is worth trying the following common location for the Hadoop configuration files: /etc/hadoop/conf

    export HADOOP_CONF_DIR=/etc/hadoop/conf
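A quick way to validate the location before exporting it is to check that the expected client config files are actually present. The helper below is a hypothetical sketch (not from the original answer); on the cluster you would run it against /etc/hadoop/conf:

```shell
# Sanity-check a candidate HADOOP_CONF_DIR: it should contain the
# Hadoop client configuration files Flink needs to read.
check_conf_dir() {
  dir="$1"
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      return 1
    fi
  done
  echo "ok: $dir"
}

# Demonstration against a throwaway directory; on a real node you would
# run: check_conf_dir /etc/hadoop/conf && export HADOOP_CONF_DIR=/etc/hadoop/conf
mkdir -p /tmp/demo-conf
touch /tmp/demo-conf/core-site.xml /tmp/demo-conf/hdfs-site.xml
check_conf_dir /tmp/demo-conf
```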

Answer 1 (score: 0)

(I talked to the author of the original question privately to figure out this solution.)

The log file posted in the comments of the original question indicates that the job was submitted to a standalone Flink installation. Standalone Flink currently supports access to Kerberos-secured HDFS only if the user is authenticated on all worker nodes. For Flink on YARN, only the user who starts the job on YARN needs to authenticate with Kerberos.

In addition, there was another issue in the comments section:
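In practice this means the submission flow on a secured cluster can be as simple as the following (a hedged sketch for Flink 0.9 on YARN; the principal, keytab path, and jar names are assumptions, not values from the original question):

```shell
# Authenticate the submitting user once; the YARN client then obtains
# the delegation tokens the job needs.
kinit -kt /home/qawsbtch/qawsbtch.keytab qawsbtch@EXAMPLE.COM

# Submit against YARN rather than a standalone Flink cluster.
./bin/flink run -m yarn-cluster -yn 2 \
    ./examples/flink-java-examples-0.9.0-WordCount.jar \
    hdfs:///user/qawsbtch/input.txt \
    hdfs:///user/qawsbtch/flink_out
```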

robert@cdh544-worker-0:~/hd22/flink-0.9.0$ ./bin/yarn-session.sh -n 2
20:39:50,563 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
20:39:50,600 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Using values:
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  TaskManager count = 2
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  JobManager memory = 1024
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  TaskManager memory = 1024
20:39:51,708 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:52,710 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:53,712 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:54,714 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

The problem is that you are using Flink 0.9.0 (built against Hadoop 2.2.0) on a cluster running Hadoop/YARN 2.6.0 with YARN HA enabled. Flink's old (2.2.0) Hadoop client libraries cannot correctly read the ResourceManager address of an HA setup, so they fall back to the default 0.0.0.0:8032 seen in the retry loop above.

Downloading a Flink build for Hadoop 2.6.0 will make it work.
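For context, with YARN HA enabled the ResourceManager address is no longer a single `yarn.resourcemanager.address` value but a set of per-RM entries like the following, which the older client cannot resolve (a minimal sketch; the RM IDs and hostnames are illustrative):

```xml
<!-- yarn-site.xml (HA mode): per-ResourceManager addressing that the
     Hadoop 2.2.0 libraries bundled with Flink 0.9.0 cannot read -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1.example.com</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2.example.com</value>
</property>
```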