Spark job fails in Oozie coordinator on EMR: Can not create a Path from an empty string

Date: 2017-09-12 21:02:47

Tags: apache-spark oozie emr oozie-coordinator

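I am having trouble configuring an Oozie coordinator on a YARN cluster. The job is a Spark job. When I run the workflow from the console, YARN launches and executes the job correctly, but when the workflow is invoked from coordinator.xml I get this error: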

ERROR org.apache.spark.SparkContext  - Error initializing SparkContext.
java.lang.IllegalArgumentException: Can not create a Path from an empty string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
    at org.apache.hadoop.fs.Path.<init>(Path.java:135)
    at org.apache.hadoop.fs.Path.<init>(Path.java:94)
    at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:337)

The job is never launched on the YARN cluster. It looks like YARN is not receiving the correct path to the .jar from Oozie. Any ideas?

Here are simplified versions of coordinator.xml and workflow.xml.

<coordinator-app name="Firebase acquisition process coordinator" frequency="${coord:days(1)}"
start="${startTime}" end="${endTime}" timezone="UTC" xmlns="uri:oozie:coordinator:0.5">
   <controls>
...
   </controls>
   <action>
      <workflow>
         <app-path>hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020/user/hadoop/emr-spark/</app-path>
      </workflow>
   </action>
</coordinator-app>

<workflow-app name="bbbbbbbbbbbbbbb" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-0324"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="spark-0324">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>client</mode>
            <class>classsxxx.Process</class>
            <jar>hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020/user/hadoop/emr-spark/lib/jarnamex.jar</jar>
            <file>lib#lib</file>
        </spark>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

To be clear, when I run this:

oozie job -config ~/emr-spark/job.properties -run

it works! But when I try this:

oozie job -run -config ~/emr-coordinator/coordinator.properties

it does not work.

job.properties

oozie.use.system.libpath=true
send_email=False
dryrun=False
nameNode=hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020
jobTracker=ip-111-11-11-111.us-west-2.compute.internal:8032
oozie.wf.application.path=/user/hadoop/emr-spark

coordinator.properties

startTime=2017-09-08T19:46Z
endTime=2030-01-01T06:00Z
jobTracker=ip-111-11-11-111.us-west-2.compute.internal:8032
nameNode=hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020
oozie.coord.application.path=hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020/user/hadoop/emr-coordinator
oozie.use.system.libpath=true

1 answer:

Answer 0 (score: 0)

When you reference resources in the HDFS file system, the path must be relative, that is, written without the hdfs://host:port prefix. The full (absolute) path is computed on demand.

So the solution is simply to replace:

hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020/user/hadoop/emr-spark/workflow.xml

with:

/user/hadoop/emr-spark/workflow.xml

and:

hdfs://ip-111-11-11-111.us-west-2.compute.internal:8020/user/hadoop/emr-spark/lib/xxxx.jar

with:

/user/hadoop/emr-spark/lib/xxxx.jar

in workflow.xml, coordinator.xml, or the properties files.
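
As a rough illustration of the change described in this answer, the two path-bearing elements from the question might end up looking like the fragments below. This is an untested sketch: the element names and the jarnamex.jar file name are taken from the question's coordinator.xml and workflow.xml, the paths simply have the hdfs://host:port prefix removed, and whether this resolves the error on a given EMR and Oozie version is an assumption.

```xml
<!-- coordinator.xml: workflow location written without the hdfs://host:port prefix -->
<workflow>
   <app-path>/user/hadoop/emr-spark/workflow.xml</app-path>
</workflow>

<!-- workflow.xml, inside the spark action: jar location given the same treatment -->
<jar>/user/hadoop/emr-spark/lib/jarnamex.jar</jar>
```

If the answer's reasoning holds, the same prefix removal would presumably apply to oozie.coord.application.path in coordinator.properties, which in the question still carries the full hdfs:// URI.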