How do I partition a Hive table by date?

Asked: 2016-04-20 09:29:08

Tags: sql apache-spark hive hiveql bigdata

I have an external table like this:

CREATE EXTERNAL TABLE TAB(ID INT, NAME STRING) PARTITIONED BY(YEAR INT, MONTH STRING, DATES INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

My data is laid out in one directory per day, like:

/user/input/2015/jan/1
/user/input/2015/jan/30

for every year from 2000 to 2016, with 12 months per year and 30 days per month.

ALTER TABLE TAB ADD PARTITION(year = '2015', month = 'jan',dates = '5') LOCATION '/user/input/2015/jan/1';  

If I do this, I get only one day of data:

select * from TAB where year = '2015' and month = 'jan' and dates = '5';

And if I run

select * from TAB where year = '2015' and month = 'jan' and dates = '6';

I get no data at all. Please help me understand how to alter the table for this scenario.

3 Answers:

Answer 0 (score: 1)

create table tab(id int, name string, dt string) partitioned by (year string, month string);

create table samp(id int, name string, dt string) row format delimited fields terminated by '\t';

load data inpath '/dir' into table samp;
insert overwrite table tab partition (year, month) select id, name, dt, YEAR(dt), MONTH(dt) from samp;
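Note that the INSERT ... PARTITION (year, month) statement above derives the partition values from the query, i.e. it uses dynamic partitioning, which Hive rejects under its default configuration. These session-level settings are typically required first:

```sql
-- Session-level settings needed before a dynamic-partition INSERT;
-- without them Hive fails with a dynamic-partition strict-mode error.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
```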

Answer 1 (score: 0)

ALTER TABLE TAB ADD PARTITION(year = '2015', month = 'jan', dates = '5') LOCATION '/user/input/2015/jan/1';

You get only one day of data because the LOCATION you specified points at a single day's directory.

A predicate such as PARTITION(dates <= '5') is not valid HiveQL; a partition spec must give an exact value for every partition column. To cover days 1 through 5, add one partition per day (Hive accepts several PARTITION clauses in a single ALTER TABLE ... ADD):

ALTER TABLE TAB ADD
  PARTITION(year = '2015', month = 'jan', dates = '1') LOCATION '/user/input/2015/jan/1'
  PARTITION(year = '2015', month = 'jan', dates = '2') LOCATION '/user/input/2015/jan/2'
  PARTITION(year = '2015', month = 'jan', dates = '3') LOCATION '/user/input/2015/jan/3'
  PARTITION(year = '2015', month = 'jan', dates = '4') LOCATION '/user/input/2015/jan/4'
  PARTITION(year = '2015', month = 'jan', dates = '5') LOCATION '/user/input/2015/jan/5';
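Hand-writing one partition clause per day across 17 years is impractical, but since the directory layout is regular, the statements can be generated and fed to Hive. A minimal sketch in shell (the year, month, and day lists here are illustrative and would need to be extended to the full 2000-2016 range):

```shell
#!/bin/sh
# Emit one ADD PARTITION statement per day directory.
# Assumes the layout /user/input/<year>/<month>/<day> from the question.
gen_add_partitions() {
  for year in 2015 2016; do          # extend to 2000..2016 as needed
    for month in jan feb mar; do     # list all twelve month names here
      for day in 1 2 3 4 5; do       # extend to 30
        echo "ALTER TABLE TAB ADD IF NOT EXISTS PARTITION (year='${year}', month='${month}', dates='${day}') LOCATION '/user/input/${year}/${month}/${day}';"
      done
    done
  done
}

gen_add_partitions > add_partitions.hql
```

The generated file can then be run in one shot, e.g. with `hive -f add_partitions.hql`; IF NOT EXISTS makes the script safe to re-run.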

Answer 2 (score: 0)

The only option is to ALTER TABLE with each date individually; I did the same thing, like "ALTER TABLE TAB ADD PARTITION(year = '2015', month = 'jan', dates = '5') LOCATION '/user/input/2015/jan/1';"
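Whichever way the partitions are added, the metastore entries can be checked from the Hive shell; a SELECT on a partitioned external table only returns rows for partitions that are registered here:

```sql
-- List the partitions currently registered in the metastore for TAB.
SHOW PARTITIONS TAB;
```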