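When I run a Spark job through Oozie, it always gets stuck in the ACCEPTED state. I followed the Hortonworks doc to set up the spark2 libraries.
When I run the same Spark job through an Oozie shell action, it works fine, and it also works via spark-submit from the edge node with the same Spark options.
Below is my workflow.xml: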
<global>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapreduce.job.queuename</name>
<value>${queueName}</value>
</property>
</configuration>
</global>
<credentials>
<credential name="HiveCreds" type="hive2">
<property>
<name>hive2.jdbc.url</name>
<value>jdbc:hive2://${hive2_server}:${hive2_port}/default</value>
</property>
<property>
<name>hive2.server.principal</name>
<value>hive/${hive2_server}@DOMAIN</value>
</property>
</credential>
</credentials>
<!--spark action using spark 2 libraries -->
<start to="SPARK2JOB" />
<action name="SPARK2JOB" cred="HiveCreds">
<spark
xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>${master}</master>
<mode>${mode}</mode>
<name>${appName}</name>
<class>${mainClass}</class>
<jar>${hdfsJarLoc}${uberJar}</jar>
<spark-opts>--num-executors ${noOfExec}
--executor-cores ${execCores}
--executor-memory ${execMem}
--driver-memory ${drivMem}
--driver-cores ${drivCores}
--conf spark.dynamicAllocation.enabled=${dynamicAllocation}</spark-opts>
<arg>${sourceFilePath}</arg>
<arg>${sourceFileName}</arg>
<arg>${outputFilePath}</arg>
<arg>${outputFileDir}</arg>
</spark>
<ok to="end" />
<error to="errorHandler" />
</action>
My job.properties:
jobTracker=HOST:8050
nameNode=hdfs://HOST:8020
hive2_server=HOSTNAME
hive2_port=10000
queueName=default
# Standard useful properties
oozie.use.system.libpath=true
#oozie.wf.rerun.failnodes=true
ooziePath=/path/
#oozie.coord.application.path=${ooziePath}
## Oozie path & Standard properties
oozie.wf.application.path=${ooziePath}
oozie.libpath = ${ooziePath}/Lib
oozie.action.sharelib.for.spark=spark2
master=yarn-cluster
mode=cluster
appName=APP_NAME
mainClass=MAIN_CLASS
uberJar=UBER_JAR
noOfExec=2
execCores=2
execMem=2G
drivMem=2g
drivCores=2
dynamicAllocation=false
I checked the Oozie spark2 library and found that I have /usr/hdp/2.6.3.0-235/spark2/jars/
My Oozie lib:
/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-core-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-kms-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-s3-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-data-lake-store-sdk-2.1.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-keyvault-core-0.8.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-storage-5.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/commons-lang3-3.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/guava-11.0.2.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-aws-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-azure-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-azure-datalake-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-annotations-2.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-core-2.4.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-databind-2.4.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/joda-time-2.9.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/json-simple-1.1.jar
/user/oozie/share/lib/lib_20180116141700/oozie/okhttp-2.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/okio-1.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/oozie-hadoop-utils-hadoop-2-4.2.0.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/oozie-sharelib-oozie-4.2.0.2.6.3.0-235.jar
Below is the error stack.
The job stays in the ACCEPTED state (as shown below) for an hour or so:
[main] INFO org.apache.spark.deploy.yarn.Client - Application report for application_1537404298109_2008 (state: ACCEPTED)
STDOUT:
INFO org.apache.spark.deploy.yarn.Client - Application report for application_1537404298109_2008 (state: ACCEPTED)
2018-09-20 14:49:15,158 [main] INFO org.apache.spark.deploy.yarn.Client - Application report for application_1537404298109_2008 (state: FAILED)
2018-09-20 14:49:15,158 [main] INFO org.apache.spark.deploy.yarn.Client -
client token: N/A
diagnostics: Application application_1537404298109_2008 failed 2 times due to AM Container for appattempt_1537404298109_2008_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://hostname:8088/cluster/app/application_1537404298109_2008 Then click on links to logs of each attempt.
Diagnostics: org.apache.hadoop.security.authorize.AuthorizationException: User:yarn not allowed to do 'DECRYPT_EEK' on 'testkey1'
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1537468694601
final status: FAILED
tracking URL: http://hostname:8088/cluster/app/application_1537404298109_2008
user: username
2018-09-20 14:49:16,189 [main] INFO org.apache.spark.deploy.yarn.Client - Deleted staging directory
<<< Invocation of Spark command completed <<<
Hadoop Job IDs executed by Spark: job_1537404298109_2008
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1537404298109_2008 finished with failed status
org.apache.spark.SparkException: Application application_1537404298109_2008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:782)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:314)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:235)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Oozie Launcher failed, finishing Hadoop job gracefully
STDERR:
org.apache.spark.SparkException: Application application_1537404298109_2008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:782)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:314)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:235)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
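I have tried every solution I could find on the Hortonworks community and on Stack Overflow for this type of issue, but nothing has worked for me. If you need any other information to help me, I am happy to add it to the question.
Thanks in advance!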
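Answer 0 (score: 0)
org.apache.hadoop.security.authorize.AuthorizationException: User:yarn not allowed to do 'DECRYPT_EEK' on 'testkey1'
DECRYPT_EEK is a permission in Ranger that needs to be granted to the user. According to the diagnostics above, the user being denied is yarn and the key is testkey1. If you are not the Ranger admin, contact your administrator.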
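When asking an administrator for the grant, it can help to state exactly which user, operation, and key were denied. The following is a minimal sketch, not part of any Hadoop or Oozie tooling, that pulls those three values out of a diagnostics line. The function name `parse_kms_denial` is hypothetical, and the regular expression assumes the message has the same shape as the one in the log above.

```python
import re

# Pattern for the authorization message seen in the YARN diagnostics.
# Assumption: the message follows the shape
# "User:<name> not allowed to do '<OPERATION>' on '<key>'" as in the log above.
DENIED = re.compile(
    r"User:(?P<user>\S+) not allowed to do '(?P<op>[A-Z_]+)' on '(?P<key>[^']+)'"
)


def parse_kms_denial(diagnostics):
    """Return (user, operation, key) from a diagnostics string, or None if absent."""
    match = DENIED.search(diagnostics)
    if match is None:
        return None
    return match.group("user"), match.group("op"), match.group("key")


# Diagnostics line copied from the question's log.
line = (
    "Diagnostics: org.apache.hadoop.security.authorize.AuthorizationException: "
    "User:yarn not allowed to do 'DECRYPT_EEK' on 'testkey1'"
)
print(parse_kms_denial(line))  # ('yarn', 'DECRYPT_EEK', 'testkey1')
```

For the log in the question this yields the user yarn, the operation DECRYPT_EEK, and the key testkey1, which are the values the permission request would need to mention.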