Setting up Kerberos authentication for an Oozie coordinator

Time: 2020-04-14 14:55:48

Tags: hadoop oozie oozie-coordinator

I am using Spark to copy data from a remote HDFS to my own HDFS.

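I have an Oozie coordinator that checks once a day whether data is available in a specified directory on the remote HDFS and, if it is, runs the workflow.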

coordinator.xml:

<coordinator-app name="My App" frequency="${coord:days(1)}" start="${startTime}" end="${endTime}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">

  <datasets>
    <dataset name="hdfsDirectory" frequency="${coord:days(1)}" initial-instance="${startTime}" timezone="UTC">
      <uri-template>${hdfsDirectoryToPoll}/partition=${YEAR}-${MONTH}-${DAY}</uri-template>
      <done-flag></done-flag>
    </dataset>
  </datasets>

  <input-events>
    <data-in name="sourceFile" dataset="hdfsDirectory">
      <start-instance>${coord:current(-1)}</start-instance>
      <end-instance>${coord:current(0)}</end-instance>
    </data-in>
  </input-events>

  <action>
    <workflow>
      <app-path>${workflowPath}</app-path>
      <configuration>
        <property>
          <name>source</name>
          <value>${coord:dataIn('sourceFile')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

workflow.xml:

<action name = "action1">
    <ssh xmlns="uri:oozie:ssh-action:0.1">
        <host>${remoteNode}</host>
        <command>${sparkSubmitCommand}</command>
    </ssh>
    <ok to = "end" />
    <error to = "kill" />
</action>
<kill name="kill">
    <message>Action failed, error message - [${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name = "end" />

job.properties:

remoteNode=remoteNode
nameNode=hdfs://test
jobTracker=test:8050


hdfsDirectoryToPoll=hdfs://remoteNode/path/to/data
sparkSubmitCommand=spark-submit spark-jar.jar
oozie.coord.application.path=${nameNode}/path/to/workflow
oozie.use.system.libpath=true

workflowPath=${nameNode}/path/to/workflow

startTime=2018-08-09T09:00Z
endTime=2018-08-10T09:00Z

My problem is that the remote cluster is Kerberized. I run kinit inside the Spark application and that works fine, but I need to do the same thing in the coordinator.

This is the error:

2020-04-14 14:18:51,437 ERROR CoordOldInputDependency:517 - SERVER[<server>] USER[-] GROUP[-] TOKEN[-] APP[-]637-oozie-oozi-C@1] org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [org.apache.hadoop.ipc.RemoteExceptionrized connection for super-user: oozie/<server>@<realm> from IP <ip>]
org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.s super-user: oozie/<server>@<realm> from IP <ip>]
        at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:113)
        at org.apache.oozie.command.coord.CoordCommandUtils.pathExists(CoordCommandUtils.java:877)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.pathExists(CoordOldInputDependency.java:220)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.checkListOfPaths(CoordOldInputDependency.java:200)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.checkPullMissingDependencies(CoordOldInputDependency.java:1
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkResolvedInput(CoordActionInputCheckXCommand.java:323)
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:173)
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:63)
        at org.apache.oozie.command.XCommand.call(XCommand.java:287)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connectionm IP <ip>
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy31.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:82
        at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy32.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
        at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:101)
        ... 13 more
2020-04-14 14:18:51,442 ERROR CoordActionInputCheckXCommand:517 - SERVER[<server>] USER[-] GROUP[-] TOKEN[-] 163133637-oozie-oozi-C@1] XException,
org.apache.oozie.command.CommandException: E1021: Coord Action Input Check Error: org.apache.oozie.service.HadoopAccessorException: E0che.hadoop.security.authorize.AuthorizationException): Unauthorized connection for super-user: oozie/<server>
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:237)
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:63)
        at org.apache.oozie.command.XCommand.call(XCommand.java:287)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [org.apache.hadoop.ipc.Remon): Unauthorized connection for super-user: oozie/<server>@<realm> from IP <ip>]
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.pathExists(CoordOldInputDependency.java:232)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.checkListOfPaths(CoordOldInputDependency.java:200)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.checkPullMissingDependencies(CoordOldInputDependency.java:1
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkResolvedInput(CoordActionInputCheckXCommand.java:323)
        at org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:173)
        ... 7 more
Caused by: org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [org.apache.hadoop.ipc.RemoteException(org.apacnection for super-user: oozie/<server>@<realm> from IP <ip>]
        at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:113)
        at org.apache.oozie.command.coord.CoordCommandUtils.pathExists(CoordCommandUtils.java:877)
        at org.apache.oozie.coord.input.dependency.CoordOldInputDependency.pathExists(CoordOldInputDependency.java:220)
        ... 11 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connectionm IP <ip>
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy31.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:82
        at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy32.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
        at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:101)
        ... 13 more

Any suggestions? Can we supply a custom oozie-site.xml to the coordinator?
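
A note on the error text, in case it helps narrow this down: "Unauthorized connection for super-user: ... from IP ..." is the message Hadoop's proxy-user (impersonation) check raises when the connecting host is not listed in `hadoop.proxyuser.<user>.hosts` on the cluster being contacted. That suggests the Oozie server did authenticate to the remote NameNode, but the remote cluster does not allow it to act as a proxy user from that address. Below is a minimal sketch of the entries that would go into the remote cluster's core-site.xml. It assumes the principal `oozie/<server>@<realm>` maps to the short name `oozie` there; the host and group values are hypothetical placeholders, not taken from the post.

```xml
<!-- core-site.xml on the REMOTE cluster (read by its NameNode). -->
<!-- Assumption: the Oozie principal maps to the short user name "oozie". -->
<property>
  <name>hadoop.proxyuser.oozie.hosts</name>
  <!-- Hypothetical value: host name or IP of the Oozie server that polls the dataset. -->
  <value>oozie-server.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.oozie.groups</name>
  <!-- Hypothetical value: group(s) of the users who submit the coordinator. -->
  <value>etl-users</value>
</property>
```

After such a change the remote NameNode has to reload its proxy-user settings, for example with `hdfs dfsadmin -refreshSuperUserGroupsConfiguration` or a restart. Note also that the input check in the stack trace (`CoordActionInputCheckXCommand`) runs inside the Oozie server process, so a kinit performed by the Spark job does not affect it. Likewise, oozie-site.xml is read by the server as a whole, so it is probably not something a single coordinator can override.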

0 answers:

No answers