Oozie: Sqoop dynamic target directory

Time: 2015-01-26 12:18:32

Tags: linux hadoop sqoop oozie oozie-coordinator


I am running a Sqoop action in an Oozie workflow. I can create the target directory with a static name in the Sqoop command, as shown below.

<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/20150126</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>

I need to create a dynamic target directory named after the date. I tried the following, but it did not work.

<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/$(date +%Y%m%d)</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>

The run fails with the following error.

 3622 [main] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.5-cdh5.2.0
  3957 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Error parsing arguments for import:
  3957 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: +%Y%m%d)
  Intercepting System.exit(1)

4 answers:

Answer 0 (score: 1)

You can pass the coordinator time from coordinator.xml to workflow.xml as a property. In the workflow you can then use it like this:

/user/test/Employee/${timePassedFromCoordinator}
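
For context, Oozie splits the <command> string on whitespace and does not run it through a shell, so $(date +%Y%m%d) is never evaluated; that is why Sqoop reported +%Y%m%d) as a separate, unrecognized argument. EL expressions of the form ${...} are resolved by Oozie before the action starts. A minimal sketch of the workflow side, reusing the action from the question and assuming the coordinator sets a property called timePassedFromCoordinator (the placeholder name used above; <host> and <port> are the question's placeholders and still need real values):

```xml
<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- timePassedFromCoordinator is assumed to be defined in the coordinator's <configuration> block -->
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/${timePassedFromCoordinator}</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>
```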

Answer 1 (score: 0)

You can use a shell action to set a variable to the date, as follows.

"variable.sh"

#!/bin/sh
# Print key=value so that <capture-output/> can expose it to later actions
echo "outputDir=$(date +%Y%m%d)"

Workflow.xml

<action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>variable.sh</exec>
            <!-- An <env-var> value is not evaluated by a shell and is not visible to other actions, so the script output is captured instead -->
            <file>variable.sh</file>
            <capture-output/>
        </shell>
        <ok to="table1" />
        <error to="fail" />
</action>
<action name="table1" cred="">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>job --exec EMPLOYEE --meta-connect jdbc:hsqldb:hsql://<host>:<port>/sqoop -- --target-dir /user/test/Employee/${wf:actionData('shell1')['outputDir']}</command>
        </sqoop>
        <ok to="end" />
        <error to="kill" />
</action>

Answer 2 (score: 0)

In the coordinator, you can get the date and format it as required:

<action>
        <workflow>
            <app-path>${WF_Maig_1}</app-path>
            <configuration>

                <property><name>currentbatchtime</name><value>${coord:formatTime(coord:dateOffset(coord:nominalTime(),0,'DAY'),"yyyy-MM-dd")}</value></property>
                <property><name>nextbatchtime</name><value>${coord:formatTime(coord:dateOffset(coord:nominalTime(),1,'DAY'),"yyyy-MM-dd")}</value></property>
            </configuration>
        </workflow>
    </action>

Now you can use /user/test/Employee/${currentbatchtime} in workflow.xml and in the properties file.
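
If the directory name should match the yyyyMMdd form from the question (for example 20150126), the same coordinator function can be given a different pattern. A minimal sketch to place inside the <configuration> block shown above; the property name targetDate is hypothetical, and the pattern is assumed to follow Java SimpleDateFormat conventions:

```xml
<property>
    <!-- targetDate is a hypothetical name; reference it in workflow.xml as ${targetDate} -->
    <name>targetDate</name>
    <value>${coord:formatTime(coord:nominalTime(), 'yyyyMMdd')}</value>
</property>
```

The workflow could then use --target-dir /user/test/Employee/${targetDate}. Note that this value is the coordinator's nominal time for the run, not the wall-clock time at which the action actually executes.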

Answer 3 (score: 0)

Capturing the output worked for me!

<action name='custom-var'>
   <shell xmlns="uri:oozie:shell-action:0.1">
      ...
     <exec>set_variable.sh</exec>
     <file>set_variable.sh</file>
     <capture-output/>
   </shell>
</action>

<action name='sqoop-test'>
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    ...
    <command> --target-dir /test/${wf:actionData('custom-var')['var1']}  --m 1 </command>
  </sqoop>
</action>

set_variable.sh
echo "var1=$(date +%Y/%m/%d)"