Hive load data action using Oozie

Time: 2014-11-21 12:53:11

Tags: hadoop load hive oozie

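I have defined a Hive action in an Oozie workflow.xml that loads data from an HDFS path. Unfortunately, it is not working. The same script works when used to create a file in Hive. Could you please look at my workflow.xml, job.properties and script file, and correct me if I have made a mistake? Any help is appreciated. Thanks in advance.
    script.hql contains "load data inpath '/../ hdfs dir' into table test;"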

**workflow.xml**


<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">
    <start to="hive-action"/>

    <action name="hive-action">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobtracker}</job-tracker>
            <name-node>${namenode}</name-node>
            <job-xml>hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>oozie.hive.defaults</name>
                    <value>${namenode}/</value>
                </property>
                <property>
                    <name>mapred.reduce.tasks</name>
                    <value>2</value>
                </property>
            </configuration>
            <script>script.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed with an error. Please look into it: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <end name="end"/>
</workflow-app>



job.properties
--------------

namenode=hdfs://namenodeipaddress:8020    
jobtracker=jobtrackeripaddress:8032    
queueName=default    
oozie.use.system.libpath=true    
oozie.libpath=${namenode}/user/oozie/share/lib    
oozie.wf.application.path=${namenode}/user/username/OozieScripts

Please find the error log from Oozie below.

2014-11-24 11:07:43,984 INFO org.apache.oozie.servlet.CallbackServlet: SERVER[HOSTNAME] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000005-141121151044934-oozie-oozi-W] ACTION[0000005-141121151044934-oozie-oozi-W@hive-action] callback for action [0000005-141121151044934-oozie-oozi-W@hive-action]
2014-11-24 11:07:44,339 INFO org.apache.oozie.command.wf.ActionEndXCommand: SERVER[HOSTNAME] USER[USERNAME] GROUP[-] TOKEN[] APP[hive-wf] JOB[0000005-141121151044934-oozie-oozi-W] ACTION[0000005-141121151044934-oozie-oozi-W@hive-action] ERROR is considered as FAILED for SLA
2014-11-24 11:07:44,391 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[HOSTNAME] USER[USERNAME] GROUP[-] TOKEN[] APP[hive-wf] JOB[0000005-141121151044934-oozie-oozi-W] ACTION[0000005-141121151044934-oozie-oozi-W@fail] Start action [0000005-141121151044934-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2014-11-24 11:07:44,391 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[HOSTNAME] USER[USERNAME] GROUP[-] TOKEN[] APP[hive-wf] JOB[0000005-141121151044934-oozie-oozi-W] ACTION[0000005-141121151044934-oozie-oozi-W@fail] [***0000005-141121151044934-oozie-oozi-W@fail***]Action status=DONE
2014-11-24 11:07:44,391 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[HOSTNAME] USER[USERNAME] GROUP[-] TOKEN[] APP[hive-wf] JOB[0000005-141121151044934-oozie-oozi-W] ACTION[0000005-141121151044934-oozie-oozi-W@fail] [***0000005-141121151044934-oozie-oozi-W@fail***]Action updated in DB!

1 answer:

Answer 0 (score: 0)

You have to define a hive-default.xml file in order to execute a Hive script in Oozie, and that file has to be referenced in workflow.xml like this:

<property>
    <name>mapred.job.queue.name</name>
    <value>${queueName}</value>
</property>
<property>
    <name>oozie.hive.defaults</name>
    <value>/usr/foo/hive-0.6-default.xml</value>
</property>
<property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
</property>

For details, see the Hive workflow model.
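To show how this fits into the workflow from the question, here is a rough sketch of the whole action with `oozie.hive.defaults` pointing at a file instead of at `${namenode}/`. It assumes a copy of the Hive defaults file has been uploaded next to workflow.xml in the application directory under the name `hive-default.xml`. Both the file name and that location are my assumptions, not something taken from the question, so adjust them to your setup.

```xml
<!-- Sketch only: the file name hive-default.xml and its location next to
     workflow.xml in the application directory are assumptions. -->
<action name="hive-action">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobtracker}</job-tracker>
        <name-node>${namenode}</name-node>
        <!-- Site-specific Hive settings, as in the question -->
        <job-xml>hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
            <property>
                <!-- Should name a Hive defaults file, not a bare directory such as ${namenode}/ -->
                <name>oozie.hive.defaults</name>
                <value>hive-default.xml</value>
            </property>
        </configuration>
        <script>script.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

As far as I know, later Oozie releases ignore `oozie.hive.defaults` and rely only on the hive-site.xml passed through `<job-xml>`, so it is worth checking the Hive action documentation for the Oozie version you are running.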