I am trying to run a PySpark script in an Oozie workflow, but the script is not triggered. What changes should I make to the code?
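I am trying to run a PySpark script that works fine when I execute it directly from the terminal with the spark-submit command. When I run it through an Oozie workflow, the script is not triggered at all. Please refer to the workflow below. Could the tags or the Spark options be causing this problem?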
<workflow-app name="My_Workflow"
xmlns="uri:oozie:workflow:0.5">
<global>
<configuration>
<property>
<name>mapreduce.job.queuename</name>
<value>NONP.${queueName}</value>
</property>
</configuration>
</global>
<start to="cmp_feed_delta_process-A"/>
<action name="cmp_feed_delta_process-A" cred='hive_credentials'>
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
</configuration>
<master>yarn</master>
<mode>cluster</mode>
<name>JB_CMP_OBS</name>
<jar>${nameNode}/user/HAASAAD0404_04647/cmp/daa/temp/JB_CMP_OBS.py</jar>
<spark-opts>--queue ${env}.${queueName} --files ${nameNode}/user/CoreSiteXMLs/hive-site.xml --executor-memory 25g --driver-memory 4g --num-executors 100 --executor-cores 5 --conf spark.executor.extraJavaOptions="-XX:-UseGCOverheadLimit"
</spark-opts>
</spark>
<ok to="EMAIL_SUCCESS"/>
<error to="EMAIL_FAILURE"/>
</action>
<action name="EMAIL_SUCCESS">
<email
xmlns="uri:oozie:email-action:0.1">
<to>${success_emails}</to>
<subject>SUCCESS with workflow ID: ${wf:id()}</subject>
<body>Hi,
Data loading with workflow ID: ${wf:id()} completed successfully !
This is an auto-generated email.
Please do not reply to this email.
Thanks
Big Data Team.
</body>
</email>
<ok to="end"/>
<error to="kill"/>
</action>
<action name="EMAIL_FAILURE">
<email
xmlns="uri:oozie:email-action:0.1">
<to>${failure_emails}</to>
<subject>FAILURE with workflow ID: ${wf:id()}</subject>
<body>Hi,
Data loading with workflow ID: ${wf:id()} failed !!!
This is an auto-generated email.
Please do not reply to this email.
Thanks,
Big Data Team.
</body>
</email>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]
</message>
</kill>
<end name="end"/>
</workflow-app>
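I expect the output to be stored in a table, as coded in the script. The Oozie workflow finishes with a successful status, but the script is never triggered.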