I am trying to use distcp to copy data from S3 to HDFS. Below is the shell script that runs distcp.
mkdir.sh
hadoop distcp s3n://bucket-name/foldername hdfs://localhost:8020/user/hdfs/data/
The script above works fine when I run it manually.
But when I run the same script through an Oozie workflow, distcp fails.
I am running the workflow with a shell action.
Here is my job.properties file:
nameNode=hdfs://ip-172-31-34-170.us-west-2.compute.internal:8020
jobTracker=ip-172-31-34-195.us-west-2.compute.internal:8032
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib
user.name=hdfs
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/
mkdirshellscript=${oozie.wf.application.path}/mkdir.sh
My workflow.xml is as follows:
<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
<start to="shellAction"/>
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="/user/hdfs/hari123"/>
<mkdir path="/user/hdfs/hari123"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${mkdirshellscript}</exec>
<file>${mkdirshellscript}</file>
</shell>
<ok to="end"/>
<error to="killAction"/>
</action>
<kill name="killAction">
<message>"Killed job due to error"</message>
</kill>
<end name="end"/>
</workflow-app>
The Oozie log is as follows:
2014-09-30 10:31:51,102 INFO org.apache.oozie.servlet.CallbackServlet: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] callback for action [0000018-140930055823135-oozie-oozi-W@shellAction]
2014-09-30 10:31:51,337 INFO org.apache.oozie.command.wf.ActionEndXCommand: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[hdfs] GROUP[-] TOKEN[] APP[WorkFlowForShellActionWithCaptureOutput] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] ERROR is considered as FAILED for SLA
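These two server-side lines only report that the action ended in error; the distcp output itself normally lands in the stdout/stderr of the launcher job. As a rough sketch (not part of the original post), the helper below pulls the workflow ID out of an Oozie log line and, when an ID is given as an argument, asks the server for the job's status and log. It assumes the `oozie` CLI is on the PATH and that `OOZIE_URL` points at the Oozie server; the function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract the workflow job ID from an Oozie server log line.
# Reads log line(s) on stdin and prints the text found inside JOB[...].
extract_job_id() {
  sed -n 's/.*JOB\[\([^]]*\)].*/\1/p'
}

# Query Oozie for the status and the log of a workflow job.
# Assumption: the oozie CLI is installed and OOZIE_URL is set.
show_job_details() {
  oozie job -info "$1"
  oozie job -log "$1"
}

# Only contact the server when a job ID is passed on the command line.
if [ "$#" -ge 1 ]; then
  show_job_details "$1"
fi
```

If log aggregation is enabled, the launcher's stdout and stderr can usually also be fetched with `yarn logs -applicationId <application id>`, using the application ID of the launcher job.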
I want to run distcp in Oozie with a shell action rather than the distcp action.
Answer 0 (score: 0)
Try:
<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
...
<start to="shellAction"/>
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="/user/hdfs/hari123"/>
<mkdir path="/user/hdfs/hari123"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>mkdir.sh</exec>
<file>${mkdirshellscript}#mkdir.sh</file>
</shell>
...
</workflow-app>
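It may also be worth checking the script itself. A shell action runs on whichever cluster node YARN picks, so `hdfs://localhost:8020` in `mkdir.sh` refers to that node and not necessarily to the NameNode. The following is a minimal sketch of a variant that takes the NameNode URI as its first argument. It assumes the workflow passes the value with `<argument>${nameNode}</argument>` inside the `<shell>` element; the function names are hypothetical, and the sketch does not address whether S3 credentials are available to the job on the cluster.

```shell
#!/bin/bash
# Sketch of mkdir.sh adapted for an Oozie shell action (an assumption, not the asker's script).
set -eu

SRC="s3n://bucket-name/foldername"   # source from the original script
DEST_PATH="/user/hdfs/data/"         # destination path from the original script

# Join a NameNode URI and an absolute path, dropping one trailing slash from the URI.
build_dest() {
  printf '%s%s\n' "${1%/}" "$2"
}

# Run distcp against the given NameNode and echo the command to stderr first,
# so that it shows up in the launcher's stderr log.
run_distcp() {
  local dest
  dest="$(build_dest "$1" "$DEST_PATH")"
  echo "Running: hadoop distcp $SRC $dest" >&2
  hadoop distcp "$SRC" "$dest"
}

# Run only when the NameNode URI is supplied as the first argument.
if [ "$#" -ge 1 ]; then
  run_distcp "$1"
fi
```

With `set -e`, a failing `hadoop distcp` makes the script exit with a non-zero status, which Oozie reports as a failed action.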