Is it possible to run an Oozie Spark Action without specifying inputDir & outputDir?

Asked: 2018-05-24 18:10:27

Tags: oozie mapr oozie-workflow

According to https://oozie.apache.org/docs/3.3.1/WorkflowFunctionalSpec.html#a4.1_Workflow_Job_Properties_or_Parameters we know that:

When submitting a workflow job for the workflow definition above, 3 workflow job properties must be specified:

jobTracker:
inputDir:
outputDir:

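I have a PySpark script that specifies its input and output locations inside the script itself. I don't need, and don't want, inputDir and outputDir in my workflow XML. When I run the PySpark script through Oozie, I get these messages: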

WARN ParameterVerifier:523 - SERVER[<my_server>] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition

 WARN JobResourceUploader:64 - SERVER[<my_server>] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

2018-05-24 11:52:29,844  WARN JobResourceUploader:171 - SERVER[<my_server>] No job jar file set.  User classes may not be found. See Job or Job#setJar(String).

Based on https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/util/ParameterVerifier.java, I believe the first warning is raised because I don't define "inputDir":

else {
    // Log a warning when the <parameters> section is missing
    XLog.getLog(ParameterVerifier.class).warn("The application does not define formal parameters in its XML "
            + "definition");
}

Is there a way to work around this?
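For context, the branch quoted above logs its warning when the workflow has no `<parameters>` element at all. It is not tied to inputDir or any other particular property, and it is logged at WARN level only. A minimal sketch of a formal parameters section follows, assuming workflow schema 0.4 or later. The app name `spark-wf` and the default queue value are hypothetical, not taken from the original workflow.

```xml
<!-- Sketch only: the app name "spark-wf" and the default value are hypothetical -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="spark-wf">
    <parameters>
        <!-- No <value>: the job must supply jobTracker at submission time -->
        <property>
            <name>jobTracker</name>
        </property>
        <property>
            <name>nameNode</name>
        </property>
        <!-- With <value>: used as the default when the job does not set queueName -->
        <property>
            <name>queueName</name>
            <value>default</value>
        </property>
    </parameters>
    <start to="spark-node"/>
    <!-- ... the spark-node action, the kill node, and the end node go here ... -->
</workflow-app>
```

Only the properties the workflow actually references need to be listed, so inputDir and outputDir can be left out if nothing uses them.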

Update - my XML structure

<action name="spark-node">
      <spark xmlns="uri:oozie:spark-action:0.1" >
         <job-tracker>${jobTracker}</job-tracker>
         <name-node>${nameNode}</name-node>
         <configuration>
             <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
             </property>
             <property>
                <name>mapred.input.dir</name>
                <value>${inputDir}</value>
             </property>
             <property>
                <name>mapred.output.dir</name>
                <value>${outputDir}</value>
             </property>
         </configuration>
         <master>yarn-master</master>
         <!-- <mode>client</mode>  -->
         <name>oozie_test</name>
         <jar>oozie_test.py</jar>
          <spark-opts>--num-executors 1 --executor-memory 10G --executor-cores  1 --driver-memory 1G</spark-opts>
      </spark>
      <ok to="end" />
      <error to="fail" />
   </action>
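
As far as I can tell, the job properties inputDir and outputDir only have to be defined because the action above references them in mapred.input.dir and mapred.output.dir. A sketch of the same action with those two properties dropped is shown here as an untested assumption, not a confirmed fix. The `yarn-cluster` master is also an assumption, swapped in because it is a value the Oozie Spark action documentation lists, whereas `yarn-master` is not one I can find documented.

```xml
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
            <!-- mapred.input.dir and mapred.output.dir removed, so this action
                 no longer references the inputDir or outputDir job properties -->
        </configuration>
        <!-- Assumption: yarn-cluster swapped in for the original yarn-master -->
        <master>yarn-cluster</master>
        <name>oozie_test</name>
        <jar>oozie_test.py</jar>
        <spark-opts>--num-executors 1 --executor-memory 10G --executor-cores 1 --driver-memory 1G</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With this shape, job.properties would only need jobTracker, nameNode, and queueName, and the PySpark script keeps handling its own input and output paths.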

0 answers:

No answers yet