Oozie shell action and spark-submit

Date: 2017-02-06 10:28:09

Tags: shell apache-spark oozie

I am trying to run spark-submit from a shell wrapper. The job runs fine from the command line, but it fails when scheduled through Oozie:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:394)
at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:97)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
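
For context, copy.sh is a thin wrapper around spark-submit. The actual script is not reproduced here; a minimal sketch of such a wrapper follows (the jar name matches the <file> entry in the workflow below; the main class, master, and deploy mode are placeholder assumptions):

#!/bin/bash
# Sketch of a spark-submit wrapper like copy.sh (placeholders, not the
# actual script). Arguments are forwarded from the Oozie shell action.
FILE_LIST="$1"      # hdfs://.../oozie-test/file-list/xxx_xxx_201610.lst
SAMPLE_DIR="$2"     # hdfs://.../oozie-test/sample
OUTPUT_DIR="$3"     # hdfs://.../oozie-test/output

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.CopyJob \
  xxxx_Integration.jar \
  "$FILE_LIST" "$SAMPLE_DIR" "$OUTPUT_DIR" "$4" "$5" "$6"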

Here is my workflow:

<workflow-app name="OozieTest1" xmlns="uri:oozie:workflow:0.5">
    <start to="CopyTest"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="CopyTest">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>lib/copy.sh</exec>
            <argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/file-list/xxx_xxx_201610.lst</argument>
            <argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/sample</argument>
            <argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/output</argument>
            <argument>IMMUN</argument>
            <argument>N</argument>
            <argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/resources/script-constants.properties</argument>
            <file>hdfs://xxxxxx/user/xxxxxx/oozie-test/lib/copy.sh#copy.sh</file>
            <file>hdfs://xxxxxx/user/xxxxxx/oozie-test/lib/xxxx_Integration.jar#xxxx_Integration.jar</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
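
The ${jobTracker} and ${nameNode} parameters come from the job.properties used at submission time; a minimal sketch (host names and ports are placeholders, masked the same way as the paths above):

# Hypothetical job.properties for this workflow; hosts/ports are placeholders.
nameNode=hdfs://xxxxxx:8020
jobTracker=xxxxxx:8032
oozie.wf.application.path=${nameNode}/user/xxxxxx/oozie-test
oozie.use.system.libpath=true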

1 Answer:

Answer 0 (score: 0)

It depends on the versions of Spark, Hadoop, and Oozie you are using, but most likely you have a dependency problem (a jar is missing). I would really recommend checking your dependencies. You can find a complete working example here.
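
A NoClassDefFoundError on org/apache/hadoop/fs/FSDataInputStream specifically means the Hadoop client jars are not on spark-submit's classpath in the environment the Oozie launcher gives the shell action. One common remedy (a sketch, assuming a Spark 1.4+ "Hadoop free" build and that the hadoop binary is on the PATH of the node running the action; paths vary per distribution) is to export the classpath at the top of the wrapper:

# Possible addition at the top of copy.sh: expose the Hadoop client jars
# to spark-submit. Paths here are assumptions, not taken from the question.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_DIST_CLASSPATH="$(hadoop classpath)"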

In that example, the Hadoop and Spark versions are as follows:

<hadoop.version>2.6.0-cdh5.4.7</hadoop.version>
<spark.version>1.3.0-cdh5.4.7</spark.version>
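
In a Maven build, those properties would typically drive dependency declarations along these lines (a sketch using the standard CDH artifact coordinates, not copied from the linked example; CDH 5.4 ships Spark built against Scala 2.10):

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>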