Submitting a Spark job from a Windows machine to a remote YARN cluster (Unix)

Time: 2019-02-15 12:48:54

Tags: apache-spark yarn

I have set up Apache Spark on a Windows machine and am trying to submit a PySpark job to a remote kerberized Hadoop cluster (YARN).
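
For illustration, a minimal PySpark script of the kind being submitted might look like this (a sketch only; the exact job contents do not matter for the error):

from pyspark.sql import SparkSession

# Minimal smoke test; the master and deploy mode come from spark-defaults.conf below
spark = SparkSession.builder.appName("yarn-smoke-test").getOrCreate()
print(spark.range(100).count())
spark.stop()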

I can submit the job, but it fails with the following error:

19/02/15 17:57:18 INFO yarn.Client:
     client token: N/A
     diagnostics: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:mycluster, Ident: (token for viru: HDFS_DELEGATION_TOKEN owner=viru@ABC.COM, **renewer=nobody**, realUser=, issueDate=1550233620463, maxDate=1550838420463, sequenceNumber=8415071, masterKeyId=1254)
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: root.ABC.ABC
     start time: 1550233636757
     final status: FAILED
     tracking URL: https://ABC:8090/proxy/application_1549982169433_29638/
     user: viru

My spark-defaults.conf:

spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/anaconda/anaconda3/bin/python3
spark.executor.extraClassPath=/opt/cloudera/parcels/SPARK2/lib/spark2/jars/commons-lang3-3.3.2.jar
spark.driver.extraClassPath=C:\Users\viru\Downloads\jupyter\spark-2.3.0-bin-hadoop2.6\jars\commons-lang3-3.5.jar
spark.master=yarn
spark.submit.deployMode=client
spark.yarn.jars=local:/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/jars/*
spark.driver.extraLibraryPath=C:\Users\viru\Downloads\jupyter\spark-2.3.0-bin-hadoop2.6\jars
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p3163.3423/lib/hadoop/lib/native
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p3163.3423/lib/hadoop/lib/native
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.driver.memory=1G
spark.yarn.am.memory=1G
spark.yarn.queue=root.ABC.ABC
spark.driver.host=1.2.3.4
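
A quick way to confirm which of these values the Windows-side driver actually picks up is to dump the effective configuration from a running session (a small debugging sketch, not part of the job itself):

from pyspark.sql import SparkSession

# Create (or reuse) a session and print the resolved YARN/driver/executor settings
spark = SparkSession.builder.getOrCreate()
for key, value in sorted(spark.sparkContext.getConf().getAll()):
    if key.startswith(("spark.yarn", "spark.driver", "spark.executor")):
        print(key, "=", value)
spark.stop()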

I have installed MIT Kerberos on the Windows machine and have obtained a valid ticket as well.
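
To check that the ticket is actually visible to the process that launches Spark, something like the following can be run before submitting (this assumes MIT Kerberos' klist, rather than the Windows built-in one, is first on the PATH):

import subprocess

# Show the Kerberos ticket cache as seen by this process;
# the TGT for viru@ABC.COM should be listed before running spark-submit.
subprocess.run(["klist"], check=True)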

Can someone help me with this?

0 Answers:

No answers yet