I am trying to split one record into multiple records in Hive, based on the start date, the end date and the value column. Here is the input:
id startdate enddate value
1 01/02/2017 10/02/2017 1000
2 01/02/2019 02/02/2019 5000
Sample output:
id startdate enddate value
1 01/02/2017 01/31/2017 100
1 02/02/2017 02/28/2017 100
1 03/02/2017 03/31/2017 100
1 04/02/2017 04/30/2017 100
1 05/02/2017 05/31/2017 100
1 06/02/2017 06/30/2017 100
1 07/02/2017 07/31/2017 100
1 08/02/2017 08/31/2017 100
1 09/02/2017 09/30/2017 100
1 10/02/2017 10/02/2017 100
2 01/02/2019 01/31/2019 2500
2 01/02/2019 02/02/2019 2500
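I have a table with the columns id, start date, end date and value. I want to split each record by month: if the start date and end date are 10 months apart, one record should become 10 records, with the value divided evenly among them. How can we do this in Hive? Thanks for any help.
Answer 0: (score: 0)
Could you try the following?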
-- dt_str should be your date table where you should have all the dates
with dt_str as (select explode(split('01/02/2017,02/02/2017,03/02/2017,04/02/2017,05/02/2017,06/02/2017,07/02/2017,08/02/2017,09/02/2017,10/02/2017', ',')) as date1),
ip_rec as (select 1 as id, '01/02/2017' as start_dt, '10/02/2017' as end_dt, 1000 as value),
res1 as (select t1.*, t2.date1 from ip_rec t1, dt_str t2
  where from_unixtime(unix_timestamp(t2.date1, 'MM/dd/yyyy'), 'yyyyMMdd') >= from_unixtime(unix_timestamp(t1.start_dt, 'MM/dd/yyyy'), 'yyyyMMdd')
    and from_unixtime(unix_timestamp(t2.date1, 'MM/dd/yyyy'), 'yyyyMMdd') <= from_unixtime(unix_timestamp(t1.end_dt, 'MM/dd/yyyy'), 'yyyyMMdd')),
res2 as (select id, count(*) cnt from res1 group by id) -- get number of records for each id
select t1.id,
  t1.date1 as start_date,
  last_day(from_unixtime(unix_timestamp(t1.date1, 'MM/dd/yyyy'), 'yyyy-MM-dd')) as end_date,
  value/cnt
from res1 t1 inner join res2 t2 on t1.id = t2.id;
Result:
OK
1 01/02/2017 2017-01-31 100.0
1 02/02/2017 2017-02-28 100.0
1 03/02/2017 2017-03-31 100.0
1 04/02/2017 2017-04-30 100.0
1 05/02/2017 2017-05-31 100.0
1 06/02/2017 2017-06-30 100.0
1 07/02/2017 2017-07-31 100.0
1 08/02/2017 2017-08-31 100.0
1 09/02/2017 2017-09-30 100.0
1 10/02/2017 2017-10-31 100.0
Time taken: 52.142 seconds, Fetched: 10 row(s)
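Note that the last row in this result ends at 2017-10-31, whereas the sample output in the question ends at the record's own end date (10/02/2017). One way to cap it might be to wrap the `last_day(...)` expression in `least(...)` together with the parsed end date. This is a minimal sketch, assuming Hive 1.1 or later (for `last_day` and `least`); the `sample` CTE and its single row are hypothetical stand-ins for one row of `res1`:

```sql
-- Hypothetical one-row input in the question's MM/dd/yyyy format.
with sample as (
  select '10/02/2017' as date1, '10/02/2017' as end_dt
)
select
  date1 as start_date,
  -- Both arguments are yyyy-MM-dd strings, so the string comparison
  -- done by least() follows chronological order.
  least(
    last_day(from_unixtime(unix_timestamp(date1, 'MM/dd/yyyy'), 'yyyy-MM-dd')),
    from_unixtime(unix_timestamp(end_dt, 'MM/dd/yyyy'), 'yyyy-MM-dd')
  ) as end_date
from sample;
```

Rows whose month ends before the record's end date keep the month-end value; only the final row is shortened.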
Hope this helps.
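If no date table is available, the monthly start dates could also be generated from each record itself. The following is only a sketch and not part of the query above. It assumes Hive 1.2 or later (for `months_between`), and it assumes that stepping in whole calendar months from the start date is the intended split. The CTE names `parsed` and `spans`, the column `n_months`, and the aliases `m`, `i`, `x` and `value_per_month` are made up for illustration:

```sql
with ip_rec as (
  select 1 as id, '01/02/2017' as start_dt, '10/02/2017' as end_dt, 1000 as value
),
parsed as (
  -- Convert MM/dd/yyyy strings to yyyy-MM-dd so Hive date functions accept them.
  select id, value,
         from_unixtime(unix_timestamp(start_dt, 'MM/dd/yyyy'), 'yyyy-MM-dd') as start_d,
         from_unixtime(unix_timestamp(end_dt, 'MM/dd/yyyy'), 'yyyy-MM-dd') as end_d
  from ip_rec
),
spans as (
  -- Number of whole months between start and end (9 for the row above).
  select id, value, start_d, end_d,
         cast(floor(months_between(end_d, start_d)) as int) as n_months
  from parsed
)
select s.id,
       add_months(s.start_d, m.i) as start_date,
       -- Cap each month end at the record's own end date.
       least(last_day(add_months(s.start_d, m.i)), s.end_d) as end_date,
       s.value / (s.n_months + 1) as value_per_month
from spans s
-- space(n) yields n blanks; splitting on ' ' gives n + 1 elements,
-- so posexplode emits offsets 0..n, one per output row.
lateral view posexplode(split(space(s.n_months), ' ')) m as i, x;
```

The dates come back as yyyy-MM-dd strings. If the MM/dd/yyyy format from the question is needed, wrapping them in `date_format(..., 'MM/dd/yyyy')` should work on Hive 1.2 or later.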