HiveQL aggregation: difference between two values

Time: 2017-08-17 15:55:15

Tags: hive hiveql

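I am using Hive on Hadoop. I am looking for a HiveQL function that gives the difference between the last and the first value of each day. The data is recorded every 5 minutes (a gauge, or a counter that increments for each resource name), and I want one aggregated value per day for each resource name (MAC).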

1 Answer:

Answer 0: (score: 0)

Use the analytic functions first_value and last_value:

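    select colldate,
           max(last_success - first_success) as success,
           max(last_conmiss - first_conmiss) as conmiss,
           resourcename
    from
    (
        -- first and last reading per resource and day, ordered by collection time
        select
            first_value(success) over (partition by resourcename, colldate order by colltime) as first_success,
            -- explicit frame: with only "order by", the default frame ends at the current row
            last_value(success) over (partition by resourcename, colldate order by colltime
                                      rows between unbounded preceding and unbounded following) as last_success,
            first_value(conmiss) over (partition by resourcename, colldate order by colltime) as first_conmiss,
            last_value(conmiss) over (partition by resourcename, colldate order by colltime
                                      rows between unbounded preceding and unbounded following) as last_conmiss,
            colldate, resourcename
        -- "table" is a placeholder for the actual table name
        from (select s.*, to_date(s.colltime) as colldate from table s) s
    ) s
    group by colldate, resourcename;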
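
A note on the window frame: when a window has an order by but no explicit frame, Hive defaults to range between unbounded preceding and current row, so last_value returns the value at the current row rather than the final value of the day. For a counter that only increases, taking max over the differences hides this, but for a gauge that can go down it does not. The sketch below puts the two frames side by side; the table name metrics is hypothetical, the columns follow the query above, and colltime is assumed to be unique per resource.

    -- "metrics" is a hypothetical table name used only for illustration
    select resourcename,
           colltime,
           success,
           -- default frame ends at the current row: this is the current row's value
           last_value(success) over (partition by resourcename, colldate
                                     order by colltime) as last_default_frame,
           -- frame spans the whole partition: this is the day's final value
           last_value(success) over (partition by resourcename, colldate
                                     order by colltime
                                     rows between unbounded preceding and unbounded following) as last_of_day
    from (select m.*, to_date(m.colltime) as colldate from metrics m) m;

If the two result columns differ on any row other than the last reading of a day, the explicit frame is the one that gives the last-minus-first difference.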

See the documentation here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics