Sqoop in Oozie splits the --last-value argument onto a new line

Asked: 2019-06-18 12:37:55

Tags: sqoop oozie oozie-workflow

Below is my Sqoop action in an Oozie workflow.

<action name="sqoop_test" retry-max="${maxretry}" retry-interval="${retryinterval}">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>import --connect jdbc:mysql:loadbalance://sql01.sboxdc.com/mydb --username usr1 --password ******** --table source_table --incremental lastmodified -check-column last_modified --merge-key Id --last-value "${wf:actionData('get_last_modified_time')['last_modified_date']}" --target-dir /warehouse/external_data/sms/target_location --as-textfile </command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>

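The action above fails because the --last-value value is broken across two lines, so Sqoop receives it as two separate arguments.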

From the log:

Sqoop command arguments :
         import
         --connect
         jdbc:mysql:loadbalance://sql01.sboxdc.com/mydb
         --username
         usr1
         --password
         ********
         --table
         source_table
         --incremental
         lastmodified
         -check-column
         last_modified
         --merge-key
         Id
         --last-value
         "2019-01-01
         00:00:00"
         --target-dir
         /warehouse/external_data/sms/target_location
         --as-textfile


2019-06-18 11:19:25,768 ERROR [main] org.apache.sqoop.tool.BaseSqoopTool: Error parsing arguments for import:
2019-06-18 11:19:25,768 ERROR [main] org.apache.sqoop.tool.BaseSqoopTool: Unrecognized argument: 00:00:00"
2019-06-18 11:19:25,768 ERROR [main] org.apache.sqoop.tool.BaseSqoopTool: Unrecognized argument: --target-dir
2019-06-18 11:19:25,768 ERROR [main] org.apache.sqoop.tool.BaseSqoopTool: Unrecognized argument: /warehouse/external_data/sms/sb_subscribermacs
2019-06-18 11:19:25,768 ERROR [main] org.apache.sqoop.tool.BaseSqoopTool: Unrecognized argument: --as-textfile

How can I force Sqoop to keep the --last-value value together as a single argument?

1 Answer:

Answer 0 (score: 1)

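As you discovered, when you use the command element, Oozie splits the command on every space into separate arguments, and as your log shows, wrapping the value in quotes does not prevent the split. If an argument contains a space, as the date passed to --last-value does, use multiple arg elements instead; each arg element is passed to Sqoop as one argument. For example: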

<action name="sqoop_test" retry-max="${maxretry}" retry-interval="${retryinterval}">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <arg>import</arg>
        <arg>--connect</arg>
        <arg>jdbc:mysql:loadbalance://sql01.sboxdc.com/mydb</arg>
        <!--All the other arguments...-->
        <arg>--last-value</arg>
        <arg>${wf:actionData('get_last_modified_time')['last_modified_date']}</arg>
        <!--Other arguments...-->        
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
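
For reference, the argument list from the question could be written out in full roughly as follows. This is a sketch, not a tested workflow: it assumes the get_last_modified_time action captures a last_modified_date key exactly as in the question, keeps the masked password as it appears there, and spells the check-column flag with two dashes (--check-column), which is the form given in the Sqoop documentation. The value after --last-value is left unquoted on the assumption that each arg element reaches Sqoop verbatim, so any quote characters would become part of the value.

```xml
<action name="sqoop_test" retry-max="${maxretry}" retry-interval="${retryinterval}">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- One arg element per Sqoop argument; flags and their values are separate elements -->
        <arg>import</arg>
        <arg>--connect</arg>
        <arg>jdbc:mysql:loadbalance://sql01.sboxdc.com/mydb</arg>
        <arg>--username</arg>
        <arg>usr1</arg>
        <arg>--password</arg>
        <arg>********</arg>
        <arg>--table</arg>
        <arg>source_table</arg>
        <arg>--incremental</arg>
        <arg>lastmodified</arg>
        <arg>--check-column</arg>
        <arg>last_modified</arg>
        <arg>--merge-key</arg>
        <arg>Id</arg>
        <!-- No surrounding quotes here: the element body, space included, is a single argument -->
        <arg>--last-value</arg>
        <arg>${wf:actionData('get_last_modified_time')['last_modified_date']}</arg>
        <arg>--target-dir</arg>
        <arg>/warehouse/external_data/sms/target_location</arg>
        <arg>--as-textfile</arg>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With this layout, the "Sqoop command arguments" section of the launcher log should list the timestamp on one line, for example 2019-01-01 00:00:00, instead of splitting it after the date.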