Hive partition is NULL

Time: 2016-02-18 04:40:02

Tags: null hive

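When I run an INSERT OVERWRITE into Hive table B from another table A, I see an extra partition with a NULL value being created for table B. The `source` column coming from table A is dynamically partitioned. I get output like this: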

partition (ds=2015-08-21, source=null)
Loading partition {ds=2015-08-21, source=xxxxx}
Loading partition {ds=2015-08-21, source=xxxxx}

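It should not create a partition like (ds=2015-08-21, source=null). No record in the source table has a NULL `source`. However, when I list the table's partitions, only the two valid partitions are shown: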

ds=2015-08-21, source=xxxxx
ds=2015-08-21, source=xxxxx
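
One way to double-check the partition listing is to query the target table directly. The following is a minimal sketch, assuming the target table is the `AAAAAAAAAAAAAAAAAAAA` from the query in this question and that the date is the one shown in the log above. Hive places rows whose dynamic partition value is NULL or an empty string into a default partition (named by `hive.exec.default.partition.name`, which defaults to `__HIVE_DEFAULT_PARTITION__`), so that is the name to look for rather than a literal `null`.

```sql
-- List only the partitions for the date that appears in the log.
SHOW PARTITIONS AAAAAAAAAAAAAAAAAAAA PARTITION (ds='2015-08-21');

-- Count rows per dynamic partition value for that date.
-- A row reported under NULL or __HIVE_DEFAULT_PARTITION__ would mean
-- that some records really did arrive without a usable source value.
SELECT source, COUNT(*) AS row_count
FROM AAAAAAAAAAAAAAAAAAAA
WHERE ds = '2015-08-21'
GROUP BY source;
```

If both statements report only the two expected `source` values, no NULL partition was actually materialized in the table.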

Then, when I try to insert from table B (populated above) into another table C, it fails with the error "FAILED: NullPointerException null". Please help. Here is the query:

 INSERT OVERWRITE TABLE AAAAAAAAAAAAAAAAAAAA
                   PARTITION (ds='${date_string}',source)
  SELECT
    f1,f2,f3,f4,f5,f6,f7,f8,
    latest_item.source AS source
  FROM (
    SELECT
     f1,f2,f3,f4,f5,f6,f7,f8,
      top.col6 AS source
    FROM (
      SELECT
        customer_id,
        item_id,
        greatest_n(1, visit_date, order_nbr, base_item_id, cat, subcat, source) top_n
      FROM BBBBBBBBBBBBBBBBBB
      WHERE visit_date >= '${ds_window}'
        AND visit_date <= '${date_string}'
        AND source IS NOT NULL
        AND source != ''
      GROUP BY customer_id, item_id
    ) top_ns
    LATERAL VIEW explode(top_n) t AS top
  ) latest_item
  WHERE latest_item.customer_id IS NOT NULL
    AND latest_item.customer_id != ''
    AND latest_item.item_id IS NOT NULL
    AND latest_item.item_id != ''
    AND latest_item.days_since_last_purchase > 0
    AND latest_item.subcat IS NOT NULL
    AND latest_item.subcat != ''
    AND latest_item.source != ''
    AND latest_item.source IS NOT NULL

0 answers:

No answers