Hive: calculating a timestamp difference using information from the same table

Time: 2018-02-15 19:06:43

Tags: hive hiveql

I have the following data:

a.tb_name  a.start_time             a.end_time              a.createdon  total_time_min  a.modifiedon  a.byuser
_2a        2018-02-15 13:17:04.795                          2/15/2018                                  smt
_2a                                 2018-02-15 13:19:11.11  2/15/2018                                  smt

start_time and end_time are strings (I can use any data type). The output I need is the total time in minutes:

a.tb_name  a.start_time             a.end_time              a.createdon  total_time_min  a.modifiedon  a.byuser
_2a        2018-02-15 13:17:04.795                          2/15/2018                                  smt
_2a                                 2018-02-15 13:19:11.11  2/15/2018                                  smt
_2a        2018-02-15 13:17:04.795  2018-02-15 13:19:11.11  2/15/2018    2                             smt

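I tried the following, but because the two time values sit in two different columns (and, as the data shows, on two different rows), I could not get a result.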

select  minute(cast(end_time as TIMESTAMP) - cast(start_time as TIMESTAMP)) from a where tb_name like "_2a";

1 answer:

Answer 0: (score: 0)

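I used the window functions max and row_number to bring the start and end times of each tb_name onto a single row, then appended that row to the original rows with union all. Hope this helps. Thanks.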

Query:
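-- Original rows, with an empty total_time_min
select tb_name,
       start_time,
       end_time,
       '' as total_time_min,
       createdon,
       byuser
from t2
union all
-- One summary row per tb_name carrying both timestamps and their difference.
-- Note: minute() returns the minute field of the interval, so this is only
-- a total in minutes when the gap is under one hour.
select tb_name,
       max_start_time as start_time,
       max_end_time as end_time,
       cast((minute(cast(max_end_time as TIMESTAMP) - cast(max_start_time as TIMESTAMP))) as string) as total_time_min,
       createdon,
       byuser
from (
    select tb_name,
           start_time,
           end_time,
           -- max() over the partition copies the non-empty value onto every row
           max(start_time) over (partition by tb_name) as max_start_time,
           max(end_time) over (partition by tb_name) as max_end_time,
           -- row_number() is used only to keep a single row per tb_name
           row_number() over (partition by tb_name) as rnk,
           createdon,
           byuser
    from t2
) t1 where t1.rnk=1;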

Result:
tb_name  start_time               end_time                total_time_min  createdon  byuser
_2a      2018-02-15 13:17:04.795                                          20180215   smt
_2a                               2018-02-15 13:19:11.11                  20180215   smt
_2a      2018-02-15 13:17:04.795  2018-02-15 13:19:11.11  2               20180215   smt
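
If only the summary row is needed, or the gap can exceed an hour, a variant along the following lines may be worth trying. It is a minimal sketch, not a tested solution: it assumes the same table t2 as above, that the missing value on each row is NULL or an empty string, and that unix_timestamp accepts a TIMESTAMP argument in the Hive version in use.

```sql
-- Sketch under the assumptions stated above (table t2, string columns).
-- group by collapses the rows of each tb_name; max() picks the non-empty
-- start_time and end_time, since '' sorts before any date string and
-- NULLs are ignored by max().
select tb_name,
       max(start_time) as start_time,
       max(end_time)   as end_time,
       -- difference in whole seconds divided by 60, rounded down to minutes;
       -- unlike minute(), this stays correct for gaps of an hour or more
       floor((unix_timestamp(cast(max(end_time)   as timestamp))
            - unix_timestamp(cast(max(start_time) as timestamp))) / 60) as total_time_min
from t2
group by tb_name;
```

For the sample rows the two timestamps are 2 minutes 7 seconds apart at whole-second precision, so total_time_min should come out as 2, matching the result above. createdon and byuser are left out of the sketch; they could be added with max() or to the group by if they are constant per tb_name.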