我正在尝试从shell包装器运行spark-submit。虽然作业从命令行运行良好但在通过oozie进行计划时失败。
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:394)
at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
at org.apache.spark.deploy.SparkSubmitArguments.(SparkSubmitArguments.scala:97)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
这是我的工作流程:
<workflow-app name="OozieTest1" xmlns="uri:oozie:workflow:0.5">
<start to="CopyTest"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="CopyTest">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>lib/copy.sh</exec>
<argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/file-list/xxx_xxx_201610.lst</argument>
<argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/sample</argument>
<argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/output</argument>
<argument>IMMUN</argument>
<argument>N</argument>
<argument>hdfs://xxxxxx/user/xxxxxx/oozie-test/resources/script-constants.properties</argument>
<file>hdfs://xxxxxx/user/xxxxxx/oozie-test/lib/copy.sh#copy.sh</file>
<file>hdfs://xxxxxx/user/xxxxxx/oozie-test/lib/xxxx_Integration.jar#xxxx_Integration.jar</file>
<capture-output/>
</shell>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
答案 0 :(得分:0)
这取决于您使用的spark,hadoop和oozie的版本。但很可能你有一些依赖性问题。 (jar丢失了)我真的建议检查你的依赖项。在这里,您可以找到完整的工作示例here:
在这个例子中,hadoop和spark版本如下:
<hadoop.version>2.6.0-cdh5.4.7</hadoop.version>
<spark.version>1.3.0-cdh5.4.7</spark.version>