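Hive INSERT OVERWRITE DIRECTORY stored as Parquet returns NULL values

Date: 2017-03-13 12:40:31

Tags: hive

I am trying to write some data into a directory and then add that directory to a table as a partition.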

create table test (key int, value int) partitioned by (dt int) stored as parquet location '/user/me/test';
insert overwrite directory '/user/me/test/dt=1' stored as parquet select 123, 456, 1;
alter table test add partition (dt=1);
select * from test;

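This code sample is simple... but it does not work. The select statement returns NULL, NULL, 1, but I need 123, 456, 1.

When I read the data with Impala, I get 123, 456, 1 ... the expected result.

Why? What is wrong?

If I remove both "stored as parquet" clauses, everything works... but I want my data in Parquet!

PS: I want this construct for switching partitions, so that data is not exposed to users while it is still being computed...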

1 answer:

Answer 0: (score: 1)

Identifying the problem

Hive

create table test (key int, value int)
partitioned by (dt int) 
stored as parquet location '/user/me/test'
;

insert overwrite directory '/user/me/test/dt=1' 
stored as parquet 
select 123, 456
;

alter table test add partition (dt=1)
;

select * from test
;

+----------+------------+---------+
| test.key | test.value | test.dt |
+----------+------------+---------+
| NULL     | NULL       |       1 |
+----------+------------+---------+

bash

parquet-tools cat hdfs://{fs.defaultFS}/user/me/test/dt=1/000000_0 
_col0 = 123
_col1 = 456
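
A possible explanation, offered as an assumption rather than something confirmed in this thread: Hive's Parquet reader matches table columns to file columns by name by default, so `key` and `value` find nothing in a file whose columns are called `_col0` and `_col1`, whereas Impala matches Parquet columns by position by default. Hive has a `parquet.column.index.access` setting that switches to position-based matching. Whether it is available and honored depends on the Hive version, so the following is only a sketch to try, not a guaranteed fix:

```sql
-- Assumption: this Hive build supports parquet.column.index.access.
-- It asks the Parquet reader to match columns by position instead of by name.
set parquet.column.index.access=true;

select * from test
;
```

If the setting takes effect, the query should return the stored values instead of NULLs without renaming any columns.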

Verifying the problem

Hive

alter table test change column `key`    `_col0` int cascade;
alter table test change column `value`  `_col1` int cascade;

select * from test
;    

+------------+------------+---------+
| test._col0 | test._col1 | test.dt |
+------------+------------+---------+
|        123 |        456 |       1 |
+------------+------------+---------+
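
The rename above is only a diagnostic step. A minimal sketch of undoing it, using the same `change column` syntax in the opposite direction, might look like this:

```sql
-- Restore the original column names once the check is done.
alter table test change column `_col0` `key`   int cascade;
alter table test change column `_col1` `value` int cascade;
```

After this, files that were written with `_col0` and `_col1` are expected to read as NULL again, because the names no longer match.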

Suggested solution

Create an additional table, test_admin, and insert through it

create table test_admin (key int, value int) 
partitioned by (dt int) 
stored as parquet location '/user/me/test'
;

create external table test (key int, value int) 
partitioned by (dt int) 
stored as parquet 
location '/user/me/test'
;

insert into test_admin partition (dt=1) select 123, 456
;

select * from test_admin
;

+----------------+------------------+---------------+
| test_admin.key | test_admin.value | test_admin.dt |
+----------------+------------------+---------------+
|            123 |              456 |             1 |
+----------------+------------------+---------------+

select * from test
;

(empty result set)

alter table test add partition (dt=1)
;

select * from test
;

+----------+------------+---------+
| test.key | test.value | test.dt |
+----------+------------+---------+
|      123 |        456 |       1 |
+----------+------------+---------+
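
The same two-table pattern can be repeated for each new partition, which is the partition switching the question asks for. The sketch below assumes a hypothetical next load with `dt=2` and made-up values; only the table names and the location come from the answer above.

```sql
-- Hypothetical next load: compute dt=2 through the admin table.
-- Readers of `test` do not see it yet, because `test` has no dt=2 partition registered.
insert overwrite table test_admin partition (dt=2) select 789, 1011
;

-- Publish the finished partition to readers with a single metadata change.
-- Both tables share '/user/me/test', so the default partition location already holds the data.
alter table test add if not exists partition (dt=2)
;
```

Because the data is written through a table insert, the Parquet files carry the table's column names, so `test` can read them by name.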