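How to split a record into multiple records based on the end date in Hive

Time: 2019-04-26 15:22:40

Tags: hive hiveql

I am trying to split one record into multiple records in Hive, based on its start date, end date, and value column. Here is the input: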

id      startdate       enddate     value
1       01/02/2017      10/02/2017  1000
2       01/02/2019      02/02/2019  5000

Sample output

id      startdate       enddate     value
1       01/02/2017      01/31/2017  100
1       02/02/2017      02/28/2017  100
1       03/02/2017      03/31/2017  100
1       04/02/2017      04/30/2017  100
1       05/02/2017      05/31/2017  100
1       06/02/2017      06/30/2017  100
1       07/02/2017      07/31/2017  100
1       08/02/2017      08/31/2017  100
1       09/02/2017      09/30/2017  100
1       10/02/2017      10/02/2017  100
2       01/02/2019      01/31/2019  2500
2       01/02/2019      02/02/2019  2500

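I have a table with the columns id, startdate, enddate, and value. I want to split each record by month: if the start date and end date are 10 months apart, one record should become 10 records, with the value divided evenly among them. How can we do this in Hive? Thanks for any help.

1 Answer:

Answer 0: (score: 0)

Could you try the following?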

with dt_str as (
  -- This should be your date table, where you should have all the dates
  select explode(split('01/02/2017,02/02/2017,03/02/2017,04/02/2017,05/02/2017,06/02/2017,07/02/2017,08/02/2017,09/02/2017,10/02/2017', ',')) as date1
),

ip_rec as (
  -- Sample input record
  select 1 as id, '01/02/2017' as start_dt, '10/02/2017' as end_dt, 1000 as value
),

res1 as (
  -- Keep only the dates that fall between start_dt and end_dt
  select t1.*, t2.date1
  from ip_rec t1, dt_str t2
  where from_unixtime(unix_timestamp(t2.date1, 'MM/dd/yyyy'), 'yyyyMMdd') >= from_unixtime(unix_timestamp(t1.start_dt, 'MM/dd/yyyy'), 'yyyyMMdd')
    and from_unixtime(unix_timestamp(t2.date1, 'MM/dd/yyyy'), 'yyyyMMdd') <= from_unixtime(unix_timestamp(t1.end_dt, 'MM/dd/yyyy'), 'yyyyMMdd')
),

res2 as (
  -- Get the number of records for each id
  select id, count(*) cnt from res1 group by id
)

select t1.id,
       t1.date1 as start_date,
       last_day(from_unixtime(unix_timestamp(t1.date1, 'MM/dd/yyyy'), 'yyyy-MM-dd')) as end_date,
       value/cnt
from res1 t1
inner join res2 t2 on t1.id = t2.id;

Result:

OK
1       01/02/2017      2017-01-31      100.0
1       02/02/2017      2017-02-28      100.0
1       03/02/2017      2017-03-31      100.0
1       04/02/2017      2017-04-30      100.0
1       05/02/2017      2017-05-31      100.0
1       06/02/2017      2017-06-30      100.0
1       07/02/2017      2017-07-31      100.0
1       08/02/2017      2017-08-31      100.0
1       09/02/2017      2017-09-30      100.0
1       10/02/2017      2017-10-31      100.0
Time taken: 52.142 seconds, Fetched: 10 row(s)
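
Note that the last row above ends on 2017-10-31, while the sample output in the question ends on the record's own end date (10/02/2017). The query also depends on a hand-written date list. Below is a rough sketch of one way to address both points without a date table. It assumes Hive 1.2 or later (for months_between and date_format) and assumes the start and end dates fall on the same day of the month, as they do in the sample data. The names iso, months, s, e, i, x, and cnt are only illustrative, and ip_rec stands in for the real table. Treat it as a starting point rather than a verified solution.

```sql
-- Sketch only: ip_rec is a stand-in for the real table (assumption).
with ip_rec as (
  select 1 as id, '01/02/2017' as start_dt, '10/02/2017' as end_dt, 1000 as value
  union all
  select 2 as id, '01/02/2019' as start_dt, '02/02/2019' as end_dt, 5000 as value
),
iso as (
  -- Convert MM/dd/yyyy strings to yyyy-MM-dd so Hive's date functions accept them
  select id, value,
         from_unixtime(unix_timestamp(start_dt, 'MM/dd/yyyy'), 'yyyy-MM-dd') as s,
         from_unixtime(unix_timestamp(end_dt,   'MM/dd/yyyy'), 'yyyy-MM-dd') as e
  from ip_rec
),
months as (
  -- space(n) yields n spaces; splitting on ' ' gives n + 1 elements,
  -- so posexplode emits one row per month with offsets i = 0 .. n
  select id, value, s, e, n.i,
         cast(floor(months_between(e, s)) as int) + 1 as cnt
  from iso
  lateral view posexplode(
    split(space(cast(floor(months_between(e, s)) as int)), ' ')
  ) n as i, x
)
select id,
       date_format(add_months(s, i), 'MM/dd/yyyy') as startdate,
       -- Cap the month end at the record's own end date (ISO strings compare correctly)
       date_format(least(last_day(add_months(s, i)), e), 'MM/dd/yyyy') as enddate,
       value / cnt as value
from months;
```

The least(...) call is what keeps the final piece from running past the original end date. If the start and end dates can fall on different days of the month, the floor(months_between(...)) count would need to be revisited.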

Hope this helps.