Oozie Sqoop Hive import: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

Date: 2016-07-15 07:29:59

Tags: cloudera sqoop oozie

My workflow looks like this:

<workflow-app xmlns="uri:oozie:workflow:0.2" name="oozie-sqoop">
  <start to="sqoop1" />
  <action name="sqoop1">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>localhost:8032</job-tracker>
      <name-node>hdfs://quickstart.cloudera:8020</name-node>
      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:mysql://8.8.8.8:3306/pro-data</arg>
      <arg>--username</arg>
      <arg>root</arg>
      <arg>--table</arg>
      <arg>data_source</arg>
      <arg>--hive-import</arg>
    </sqoop>
    <ok to="end" />
    <error to="fail" />
  </action>
  <kill name="fail">
    <message>sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end" />
</workflow-app>
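
As far as I understand, Sqoop's --hive-import needs a valid hive-site.xml on the launcher's classpath, which a plain Oozie Sqoop action does not ship by default. A minimal sketch of a variant that addresses this, assuming hive-site.xml has been uploaded next to workflow.xml (that file and its location are assumptions, not shown in this post):

    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>localhost:8032</job-tracker>
      <name-node>hdfs://quickstart.cloudera:8020</name-node>
      <arg>import</arg>
      <!-- ...same --connect/--username/--table arguments as above... -->
      <arg>--hive-import</arg>
      <!-- ships hive-site.xml into the action's working directory (assumed to exist in the app dir) -->
      <file>hive-site.xml#hive-site.xml</file>
    </sqoop>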

It always fails with the following error:

Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

I can import the data into HDFS with this workflow if I add a --target-dir argument pointing at an HDFS path, but when I use --hive-import it does not work. Is there anything wrong with my XML?
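
For comparison, a sketch of the HDFS-only argument list that does work (the target path /user/cloudera/data_source is only an illustration):

      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:mysql://8.8.8.8:3306/pro-data</arg>
      <arg>--username</arg>
      <arg>root</arg>
      <arg>--table</arg>
      <arg>data_source</arg>
      <arg>--target-dir</arg>
      <!-- illustrative HDFS path -->
      <arg>/user/cloudera/data_source</arg>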

I am actually submitting the job through the Oozie REST API here. My endpoint and payload look like this:

http://8.8.8.8:11000/oozie/v1/jobs?jobtype=sqoop
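
Spelled out as a raw HTTP request, the submission looks roughly like this (a sketch; the Content-Type header matters, otherwise Oozie will not parse the body as a configuration document):

POST /oozie/v1/jobs?jobtype=sqoop HTTP/1.1
Host: 8.8.8.8:11000
Content-Type: application/xml;charset=UTF-8

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- the input data shown below -->
</configuration>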

Input data:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://quickstart.cloudera:8020</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:8032</value>
    </property>
    <property>
        <name>user.name</name>
        <value>cloudera</value>
    </property>

    <property>
        <name>oozie.sqoop.command</name>
        <value>
            import
            --connect
            jdbc:mysql://ip:3306/pro-data
            --username
            root
            --table
            data_source
            --hive-home
            /user/cloudera/warehouse/
            -m
            1
            --incremental
            append
            --check-column
            id
            --hive-import
        </value>
    </property>
    <property>
        <name>oozie.libpath</name>
        <value>hdfs://quickstart.cloudera:8020/user/oozie/share/lib/lib_20160715181153/sqoop</value>
    </property>
    <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://127.0.0.1:9083</value>
    </property>
    <property>
        <name>oozie.use.system.libpath</name>
        <value>True</value>
    </property>
    <property>
        <name>oozie.proxysubmission</name>
        <value>True</value>
    </property>
</configuration>
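
One more assumption worth flagging: oozie.libpath above points only at the sqoop sharelib directory, while a Hive import also needs the hive jars. If this Oozie version accepts comma-separated libpath entries (an assumption), the property could be widened like this:

    <property>
        <name>oozie.libpath</name>
        <!-- sqoop plus hive sharelib directories; the lib_... timestamp matches the install above -->
        <value>hdfs://quickstart.cloudera:8020/user/oozie/share/lib/lib_20160715181153/sqoop,hdfs://quickstart.cloudera:8020/user/oozie/share/lib/lib_20160715181153/hive</value>
    </property>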

Log:

2016-07-16 14:30:38,171  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] Start action [0000016-160716103436859-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:30:38,199  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] [***0000016-160716103436859-oozie-oozi-W@:start:***]Action status=DONE
2016-07-16 14:30:38,204  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] [***0000016-160716103436859-oozie-oozi-W@:start:***]Action updated in DB!
2016-07-16 14:30:38,475  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] Start action [0000016-160716103436859-oozie-oozi-W@sqoop1] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:31:17,880  INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] checking action, hadoop job ID [job_1468690384910_0024] status [RUNNING]
2016-07-16 14:31:17,887  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] [***0000016-160716103436859-oozie-oozi-W@sqoop1***]Action status=RUNNING
2016-07-16 14:31:17,887  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] [***0000016-160716103436859-oozie-oozi-W@sqoop1***]Action updated in DB!
2016-07-16 14:34:40,286  INFO CallbackServlet:520 - SERVER[quickstart.cloudera] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] callback for action [0000016-160716103436859-oozie-oozi-W@sqoop1]
2016-07-16 14:34:42,001  INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] checking action, hadoop job ID [job_1468690384910_0024] status [RUNNING]
2016-07-16 14:34:57,679  INFO CallbackServlet:520 - SERVER[quickstart.cloudera] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] callback for action [0000016-160716103436859-oozie-oozi-W@sqoop1]
2016-07-16 14:34:58,642  INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] action completed, external ID [job_1468690384910_0024]
2016-07-16 14:34:58,663  WARN SqoopActionExecutor:523 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
2016-07-16 14:34:58,987  INFO ActionEndXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] ERROR is considered as FAILED for SLA
2016-07-16 14:34:59,299  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] Start action [0000016-160716103436859-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:34:59,343  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] [***0000016-160716103436859-oozie-oozi-W@fail***]Action status=DONE
2016-07-16 14:34:59,349  INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] [***0000016-160716103436859-oozie-oozi-W@fail***]Action updated in DB!

YARN log:

mapreduce.tasktracker.http.threads=40
dfs.stream-buffer-size=4096
tfile.fs.output.buffer.size=262144
fs.permissions.umask-mode=022
dfs.client.datanode-restart.timeout=30
dfs.namenode.resource.du.reserved=104857600
yarn.resourcemanager.am.max-attempts=2
yarn.nodemanager.resource.percentage-physical-cpu-limit=100
ha.failover-controller.graceful-fence.connection.retries=1
mapreduce.job.speculative.speculative-cap-running-tasks=0.1
dfs.datanode.drop.cache.behind.writes=false
hadoop.common.configuration.version=0.23.0
mapreduce.job.ubertask.enable=false
yarn.app.mapreduce.am.resource.cpu-vcores=1
dfs.namenode.replication.work.multiplier.per.iteration=2
mapreduce.job.acl-modify-job= 
io.seqfile.local.dir=${hadoop.tmp.dir}/io/local
yarn.resourcemanager.system-metrics-publisher.enabled=false
fs.s3.sleepTimeSeconds=10
mapreduce.client.output.filter=FAILED
------------------------

Sqoop command arguments :
             import
             --connect
             jdbc:mysql://172.16.1.18:3306/pro-data
             --username
             root
             --table
             data_source
             --hive-home
             /user/cloudera/warehouse/
             -m
             1
             --incremental
             append
             --check-column
             id
             --hive-import
Fetching child yarn jobs
tag id : oozie-a68d0f5f197314a14720c8ff3935b1dc
Child yarn jobs are found - 
=================================================================

>>> Invoking Sqoop command line now >>>

42238 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
42453 [uber-SubtaskRunner] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.5.0
42572 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.BaseSqoopTool  - Using Hive-specific delimiters for output. You can override
42572 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.BaseSqoopTool  - delimiters with --fields-terminated-by, etc.
42685 [uber-SubtaskRunner] WARN  org.apache.sqoop.ConnFactory  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
43432 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.MySQLManager  - Preparing to use a MySQL streaming resultset.
43491 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.CodeGenTool  - Beginning code generation
45931 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
46198 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
46219 [uber-SubtaskRunner] INFO  org.apache.sqoop.orm.CompilationManager  - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
62817 [uber-SubtaskRunner] INFO  org.apache.sqoop.orm.CompilationManager  - Writing jar file: /tmp/sqoop-yarn/compile/78cb8ad53d1f0fe6f62c936c7688a4b8/data_source.jar
62926 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Maximal id query for free form incremental import: SELECT MAX(`id`) FROM `data_source`
62937 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Incremental import based on column `id`
62937 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Upper bound value: 45
62937 [uber-SubtaskRunner] WARN  org.apache.sqoop.manager.MySQLManager  - It looks like you are importing from mysql.
62937 [uber-SubtaskRunner] WARN  org.apache.sqoop.manager.MySQLManager  - This transfer can be faster! Use the --direct
62937 [uber-SubtaskRunner] WARN  org.apache.sqoop.manager.MySQLManager  - option to exercise a MySQL-specific fast path.
62937 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.MySQLManager  - Setting zero DATETIME behavior to convertToNull (mysql)
62979 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Beginning import of data_source
63246 [uber-SubtaskRunner] WARN  org.apache.sqoop.mapreduce.JobBase  - SQOOP_HOME is unset. May not be able to find all job dependencies.
65748 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.db.DBInputFormat  - Using read commited transaction isolation
Heart beat
Heart beat
Heart beat
148412 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Transferred 754 bytes in 85.1475 seconds (8.8552 bytes/sec)
148429 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Retrieved 9 records.
148464 [uber-SubtaskRunner] INFO  org.apache.sqoop.util.AppendUtils  - Appending to directory data_source
148520 [uber-SubtaskRunner] INFO  org.apache.sqoop.util.AppendUtils  - Using found partition 2
148685 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
148741 [uber-SubtaskRunner] WARN  org.apache.sqoop.hive.TableDefWriter  - Column created_date had to be cast to a less precise type in Hive
148741 [uber-SubtaskRunner] WARN  org.apache.sqoop.hive.TableDefWriter  - Column updated_date had to be cast to a less precise type in Hive
148743 [uber-SubtaskRunner] INFO  org.apache.sqoop.hive.HiveImport  - Loading uploaded data into Hive
Heart beat
Intercepting System.exit(1)

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://quickstart.cloudera:8020/user/cloudera/oozie-oozi/0000009-160719121646145-oozie-oozi-W/sqoop1--sqoop/action-data.seq

Oozie Launcher ends

0 Answers:

No answers yet.