Oozie workflow: Hive table does not exist, but its directory is created in HDFS

Asked: 2016-03-06 13:51:41

Tags: hadoop mapreduce hive cloudera oozie

I am trying to run a Hive action from an Oozie workflow. Below is the Hive script:

create table abc (a INT);

I can find the managed table in HDFS (a directory abc is created under /user/hive/warehouse), but when I run SHOW TABLES at the hive> prompt, I cannot see the table.

This is the workflow.xml file:

<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
 <start to="hiveac"/>
 <action name="hiveac">
    <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- <prepare> <delete path="${nameNode}/user/${wf:user()}/case1/out"/> </prepare> -->
        <!-- <job-xml>hive-default.xml</job-xml>-->
            <configuration>
                <property>
                    <name>oozie.hive.defaults</name>
                    <value>hive-default.xml</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>script.q</script>
            <!-- <param>INPUT=/user/${wf:user()}/case1/sales_history_temp4</param>
            <param>OUTPUT=/user/${wf:user()}/case1/out</param> -->
        </hive>
  <ok to="end"/>
  <error to="fail"/>
 </action>
   <kill name="fail">
   <message>Pig Script failed!!!</message>
   </kill>
   <end name="end"/>
</workflow-app>

This is the hive-default.xml file:

<configuration>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/metastore</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>password</value>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
<property>
  <name>hive.stats.autogather</name>
  <value>false</value>
</property>
</configuration>

This is the job.properties file:

nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
oozie.libpath=/user/oozie/shared/lib
#oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/my/jobhive

The logs do not show any errors:

stderr logs

Logging initialized using configuration in jar:file:/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/distcache/3179985539753819871_-620577179_884768063/localhost/user/oozie/shared/lib/hive-common-0.9.0-cdh4.1.1.jar!/hive-log4j.properties
Hive history file=/tmp/mapred/hive_job_log_mapred_201603060735_17840386.txt
OK
Time taken: 9.322 seconds
Log file: /var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/training/jobcache/job_201603060455_0012/attempt_201603060455_0012_m_000000_0/work/hive-oozie-job_201603060455_0012.log  not present. Therefore no Hadoop jobids found

I came across a similar post: Tables created by oozie hive action cannot be found from hive client but can find them in HDFS

But it did not solve my problem. Please let me know how to resolve this issue.

1 Answer:

Answer 0 (score: 0)

I have not used Oozie for a few months (and, for legal reasons, kept no archives), and in any case that was V4.x, so this is a bit of guesswork...

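  1. Upload a valid hive-site.xml somewhere in HDFS.
  2. Tell Oozie to inject all of its properties into the Launcher Configuration before running the Hive class, so that Hive inherits them all, with <job-xml>/some/hdfs/path/hive-site.xml</job-xml>
  3. Remove any reference to oozie.hive.defaults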
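
Applied to the action in the question, the steps above might look roughly like the sketch below. It is untested and keeps the question's schema version and element names. The path /some/hdfs/path/hive-site.xml is the placeholder from step 2, not a real location, so substitute wherever the file was actually uploaded.

```xml
<action name="hiveac">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Placeholder path: point this at the hive-site.xml uploaded in step 1 -->
        <job-xml>/some/hdfs/path/hive-site.xml</job-xml>
        <configuration>
            <!-- The oozie.hive.defaults property is removed, per step 3 -->
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

In the hive-action schema, <job-xml> comes before <configuration>, which is why it sits above the property block here.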

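    Warning: all of this assumes that your sandbox cluster has a persistent Metastore, i.e. that your hive-site.xml does not point to an embedded Derby database that gets erased every time!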
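
One thing worth checking against that warning: the hive-default.xml in the question pairs a MySQL JDBC URL with the Derby embedded driver class. Whether that mismatch is the actual cause is not confirmed here. As a hedged sketch, a hive-site.xml aimed at a persistent MySQL metastore would normally name a MySQL driver instead. The class name below assumes MySQL Connector/J 5.x, and the URL and user are copied from the question.

```xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- Copied from the question; "localhost" only works if the task runs on the metastore DB host -->
    <value>jdbc:mysql://localhost/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <!-- Assumption: Connector/J 5.x class name; the question's file has org.apache.derby.jdbc.EmbeddedDriver here -->
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
</configuration>
```

For this to work, the MySQL connector JAR would also have to be on the action's classpath, for example in the workflow's lib/ directory. Running `set javax.jdo.option.ConnectionURL;` at the hive> prompt prints the value the client itself is using, which can be compared with what the action receives.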