Oozie shell workflow

Time: 2016-05-04 11:40:35

Tags: shell hadoop hdfs oozie oozie-coordinator

I am trying to write a simple shell action in Oozie that copies a file from a remote machine to HDFS, but I am getting an error.

Here is my workflow file:

<workflow-app name="WorkFlowCopyLocalTohdfs" xmlns="uri:oozie:workflow:0.1">
<start to="sshAction"/>
<action name="sshAction">
    <shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>/user/root5/Oozie/Workflow/WorkFlowCopyLocalTohdfs/uploadFile.sh</exec>
        <file>/user/root5/Oozie/Workflow/WorkFlowCopyLocalTohdfs/uploadFile.sh#uploadFile.sh</file>
      <capture-output/>
</shell>
<ok to="end" />
    <error to="killAction"/>
</action>
<kill name="killAction">
    <message>"Killed job due to error"</message>
</kill>
<end name="end"/>
</workflow-app>

My uploadFile.sh is:

#!/bin/bash -e

hadoop fs -copyFromLocal /home/root5/Desktop/Avinash_sampleData/DataFolder/Data_04-05-2016 /user/root5/Oozie/DataFolder

My job.properties is:

nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default

oozie.libpath=${nameNode}/user/root/oozie-workflows/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true

oozieProjectRoot=${nameNode}/user/root5/Oozie
appPath=${oozieProjectRoot}/Workflow/WorkFlowCopyLocalTohdfs
oozie.wf.application.path=${appPath}

#inputDir=${oozieProjectRoot}/data
focusNodeLogin=root@localhost

The log output from Oozie is:

2016-05-04 16:09:36,023  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] Start action [0000012-160425173341619-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:36,023  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] [***0000012-160425173341619-oozie-oozi-W@:start:***]Action status=DONE
2016-05-04 16:09:36,023  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] [***0000012-160425173341619-oozie-oozi-W@:start:***]Action updated in DB!
2016-05-04 16:09:36,209  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] Start action [0000012-160425173341619-oozie-oozi-W@sshAction] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:36,353  WARN ShellActionExecutor:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] credentials is null for the action
2016-05-04 16:09:37,441  INFO ShellActionExecutor:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] checking action, external ID [job_201604251732_0160] status [RUNNING]
2016-05-04 16:09:37,544  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] [***0000012-160425173341619-oozie-oozi-W@sshAction***]Action status=RUNNING
2016-05-04 16:09:37,544  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] [***0000012-160425173341619-oozie-oozi-W@sshAction***]Action updated in DB!
2016-05-04 16:09:53,082  INFO CallbackServlet:539 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] callback for action [0000012-160425173341619-oozie-oozi-W@sshAction]
2016-05-04 16:09:53,317  INFO ShellActionExecutor:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] action completed, external ID [job_201604251732_0160]
2016-05-04 16:09:53,346  WARN ShellActionExecutor:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
2016-05-04 16:09:53,576  INFO ActionEndXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] ERROR is considered as FAILED for SLA
2016-05-04 16:09:53,754  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] Start action [0000012-160425173341619-oozie-oozi-W@killAction] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:53,755  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] [***0000012-160425173341619-oozie-oozi-W@killAction***]Action status=DONE
2016-05-04 16:09:53,755  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] [***0000012-160425173341619-oozie-oozi-W@killAction***]Action updated in DB!
2016-05-04 16:09:53,943  WARN CoordActionUpdateXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100

Please help me with this.

Hive-workflow.xml
<workflow-app name="WorkFlowCopyLocalTohdfs" xmlns="uri:oozie:workflow:0.1">
<start to="hive-node"/>
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
       <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>default</value>
            </property>
            <property>
                <name>oozie.hive.defaults</name>
                <value>hive-site.xml</value>
            </property>
        </configuration>
        <script>Hive_script.hql</script>
    </hive>
    <ok to="end"/>
    <error to="killAction"/>
</action>
<kill name="killAction">
    <message>"Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]"</message>
</kill>
<end name="end"/>
</workflow-app>
And the Hive_script.hql
.# LOAD DATA inpath '/user/root5/Oozie/DataFolder/Data_04_05_2016.txt' INTO TABLE OOZIE_TABLE1;

1 answer:

Answer 0 (score: 0)

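I think you have a fundamental problem here. When you submit an Oozie workflow, you cannot know in advance which node of the cluster will execute it. For that reason you should never reference the local file system in Oozie: a path such as /home/root5/Desktop/... exists only on the machine where it was created.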
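To see this effect directly, a small diagnostic script can be run as the shell action. This is only a sketch: the function name `report_local_path` and the key names are made up for illustration, and it prints key=value lines on the assumption that the action keeps `<capture-output/>`, which reads the script output in properties format.

```shell
#!/bin/bash
set -e

# Print, as key=value lines, the node that ran this script and whether
# the given local path exists on that node.
# (report_local_path and the key names are illustrative, not part of Oozie.)
report_local_path() {
  local_path="$1"
  if [ -e "$local_path" ]; then
    state="present"
  else
    state="missing"
  fi
  echo "executing_host=$(uname -n)"
  echo "local_path_state=${state}"
}

# Local path taken from the question.
report_local_path "/home/root5/Desktop/Avinash_sampleData/DataFolder/Data_04-05-2016"
```

If the reported state is "missing" on the host that ran the action, the copy in uploadFile.sh has nothing to read there.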

What can you do instead?

  • Put the file into an HDFS path manually
  • Reference this path in your workflow
  • Let Oozie copy the file from there
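The steps above could be sketched as a replacement for uploadFile.sh. The staging directory `/user/root5/Oozie/staging`, the `DRY_RUN` switch and the function name `copy_within_hdfs` are assumptions for illustration. The sketch also assumes the `hdfs` command is on the PATH of the executing node and that the file was uploaded to the staging directory beforehand, for example with `hdfs dfs -put` from the machine that holds the data.

```shell
#!/bin/bash
set -e

# Hypothetical HDFS staging directory; the file must already be there.
STAGING_DIR="${STAGING_DIR:-/user/root5/Oozie/staging}"
# Target directory taken from the question.
TARGET_DIR="${TARGET_DIR:-/user/root5/Oozie/DataFolder}"

# Copy one file from the staging directory to the target directory,
# both inside HDFS, so no local path of the executing node is involved.
# With DRY_RUN=1 the command is only printed instead of executed.
copy_within_hdfs() {
  src="${STAGING_DIR}/$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hdfs dfs -cp ${src} ${TARGET_DIR}/"
  else
    hdfs dfs -cp "${src}" "${TARGET_DIR}/"
  fi
}

# Copy the file named by the first argument, if one was given.
if [ "$#" -ge 1 ]; then
  copy_within_hdfs "$1"
fi
```

Setting `DRY_RUN=1` prints the command without contacting the cluster, which is a quick way to check the paths before submitting the workflow.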

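Also make sure you use the correct Hadoop shell command for your installed version. I am used to something like hdfs dfs -put, but you may be running a different version.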
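One way to cope with differing installations is to pick the command at run time. The following is a minimal sketch, assuming that either the `hdfs` or the `hadoop` launcher is on the PATH of the executing node; the function name `pick_hdfs_cli` is illustrative.

```shell
#!/bin/bash
set -e

# Print the file system command prefix available on this node:
# "hdfs dfs" if the hdfs launcher is found, otherwise "hadoop fs".
# Returns non-zero if neither launcher is on the PATH.
pick_hdfs_cli() {
  if command -v hdfs >/dev/null 2>&1; then
    echo "hdfs dfs"
  elif command -v hadoop >/dev/null 2>&1; then
    echo "hadoop fs"
  else
    echo "neither hdfs nor hadoop found on PATH" >&2
    return 1
  fi
}
```

A script could then run `$(pick_hdfs_cli) -ls /user/root5/Oozie/DataFolder` to confirm that the chosen command works on that node before doing any copying.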