Oozie Spark2 action: Kerberos delegation token error

Date: 2019-02-21 11:15:55

Tags: pyspark kerberos oozie

I have a Cloudera 5.12 cluster with Kerberos. I can launch a simple pyspark script with spark2-submit without any error, but when I try to launch the same script through Oozie, I get:

Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Delegation Token can be issued only with kerberos or web authentication
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:7503)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:548)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getDelegationToken(AuthorizationProviderProxyClientProtocol.java:663)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:981)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
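For reference, the direct launch that works is something like this (the exact flags are approximate, and the kinit step is simply how I authenticate beforehand):

kinit -kt interne_isoprod.keytab interne_isoprod@XXXX
spark2-submit --master yarn --deploy-mode cluster mini_job.py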

My workflow is:

<workflow-app name="${wfAppName}" xmlns="uri:oozie:workflow:0.5">
    <credentials>
        <credential name="hcatauth" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://XXXXX:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/_HOST@XXXXX</value>
            </property>
        </credential>
    </credentials>

    <start to="spark-launch"/>

    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <action name="spark-launch" cred="hcatauth">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>cluster</mode>
            <name>TestMe</name>
            <jar>mini_job.py</jar>
            <spark-opts>
                --files hive-site.xml
                --queue interne_isoprod
                --principal interne_isoprod@XXXX
                --keytab interne_isoprod.keytab
            </spark-opts>
            <file>${projectPath}/scripts/mini_job.py#mini_job.py</file>
            <file>${projectPath}/lib/hive-site.xml#hive-site.xml</file>
            <file>${projectPath}/lib/interne_isoprod.keytab#interne_isoprod.keytab</file>
        </spark>
        <ok to="End"/>
        <error to="Kill"/>
    </action>

    <end name="End"/>
</workflow-app>

and the pyspark script is:

from datetime import datetime
from pyspark.sql import SparkSession
import sys

if __name__ == "__main__":

   spark = SparkSession.builder.appName('test_spark2').getOrCreate()
   print(spark.version)
   spark.stop()
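Since hive-site.xml is shipped with the action, the real job will presumably need Hive access eventually; a Hive-enabled variant of the same minimal script would look like this (whether enableHiveSupport changes anything for this error is an open question on my side):

from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Same minimal job, with Hive support turned on so the shipped
    # hive-site.xml is actually picked up by the session
    spark = (SparkSession.builder
             .appName('test_spark2')
             .enableHiveSupport()
             .getOrCreate())
    print(spark.version)
    spark.stop()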

I tried a few variations of the above, but no luck so far.
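One thing I keep wondering about (an assumption on my part, not a confirmed fix): Oozie is supposed to obtain delegation tokens for the action itself, so maybe passing --principal and --keytab makes Spark request tokens a second time while already authenticated by token, which is exactly what the error complains about. The stripped-down spark-opts for that test would be:

<spark-opts>
    --files hive-site.xml
    --queue interne_isoprod
</spark-opts>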

0 Answers