Best way to pull the last 7 days with Impala when the datetime is stored as a string

Time: 2018-11-28 13:34:59

Tags: impala

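I'm trying to work with a dataset we have started ingesting, and of course "devicereceipttime" is stored as a string, and I can't convince anyone to change that right away. However, "year", "month", "day" and "hour" are broken out into separate fields as ints. It looks like this: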

devicereceipttime(string)   year(int)  month(int)  day(int)  hour(int)
2018-06-19T05:00:06.265Z    2018       6           19        5
2018-06-19T18:53:56.776Z    2018       6           19        6
2018-06-19T02:10:05.252Z    2018       6           19        2
2018-06-19T12:14:01.395Z    2018       6           19        12

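I'm using Impala and would like to run something like the query below, but in a form that works with the types shown above: either the "devicereceipttime" string values or the y/m/d ints. I want to capture one full week (7 consecutive days), so I will probably schedule the report to run in CDSW on a Saturday or a Monday.

This is the query I use when the datetime string is formatted as "yyyy-mm-dd hh:mm:ss":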

select *
from winworkstations_realtime
where devicereceipttime BETWEEN concat(to_date(now() - interval 8 days), " 00:00:00") and concat(to_date(now() - interval 1 days), " 24:00:00")

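Is it better to work with the string, or to try to solve this with the bunch of ints?

1 Answer:

Answer 0 (score: 0)

I came up with this to satisfy the query condition: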

devicereceipttime BETWEEN concat(to_date(now() - interval 7 days), "T00:00:00.000Z") and concat(to_date(now() - interval 1 days), "T23:59:59.999Z")
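
This predicate works without a cast because the stored values are fixed-width ISO 8601 strings, so comparing them as strings also orders them chronologically. A half-open range is a slightly more forgiving variant, since it avoids spelling out the last millisecond of the day. The sketch below assumes that every row uses the same layout as the sample rows (for example 2018-06-19T05:00:06.265Z) and reuses only the table and functions already shown in this thread:

```sql
-- Sketch: same window as the BETWEEN above (7 days ago through yesterday),
-- written as ">= start" and "< end" instead of BETWEEN.
-- Assumes devicereceipttime always looks like 2018-06-19T05:00:06.265Z.
select *
from winworkstations_realtime
where devicereceipttime >= concat(to_date(now() - interval 7 days), "T00:00:00.000Z")
  and devicereceipttime <  concat(to_date(now()), "T00:00:00.000Z")
```

One caveat: now() returns the server's local time, while the trailing Z marks the stored values as UTC. If the server is not running in UTC, the day boundaries will be shifted by the offset. Newer Impala releases provide utc_timestamp(), which may be a closer fit if it is available in your version.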

select w.destinationhostname, w.destinationusername, w.destinationprocessname, count(*) as count
from winworkstations_realtime w
where w.devicereceipttime BETWEEN concat(to_date(now() - interval 7 days), "T00:00:00.000Z") and concat(to_date(now() - interval 1 days), "T23:59:59.999Z")
  AND w.externalid = "4688"
  AND w.destinationhostname like "T%"
  AND w.destinationusername not like "%$"
  AND w.destinationusername not like "LOCAL%"
  AND w.destinationusername not like "-"
group by w.destinationhostname, w.destinationusername, w.destinationprocessname
order by 1, 2
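
On the original question of string versus ints: the int columns can be folded into a single yyyymmdd number and compared against the same window. This is only a sketch. It assumes an Impala version that provides from_timestamp(), and whether it is any faster than the string comparison depends on how the table is partitioned, which the question does not say.

```sql
-- Sketch: the 7-day window expressed with the year/month/day int columns.
-- year * 10000 + month * 100 + day turns 2018 / 6 / 19 into 20180619.
-- Column names are backtick-quoted because they collide with function names.
select *
from winworkstations_realtime
where `year` * 10000 + `month` * 100 + `day`
      between cast(from_timestamp(now() - interval 7 days, 'yyyyMMdd') as int)
          and cast(from_timestamp(now() - interval 1 days, 'yyyyMMdd') as int)
```

Both bounds are whole days, so this form needs no time-of-day suffix. It does rely on the int columns agreeing with the date part of "devicereceipttime".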