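I am using Hive with Hadoop. I am looking for a way (in HiveQL) to get the difference between the last and the first value of each day. The data is recorded every 5 minutes (either a gauge or a counter that increases per resource name), and I want one aggregated value per day for each resource name (MAC).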
Answer 0 (score: 0)
Use the analytic functions first_value and last_value:
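select colldate,
       -- every row in a (resourcename, colldate) partition carries the same
       -- first/last values, so max() only collapses them to one row per group
       max(last_success - first_success) as success,
       max(last_conmiss - first_conmiss) as conmiss,
       resourcename
from
(
  select
    first_value(success) over (partition by resourcename, colldate order by colltime) as first_success,
    -- explicit frame: with ORDER BY, the default frame ends at the current row,
    -- so last_value would otherwise not see the final reading of the day
    last_value(success) over (partition by resourcename, colldate order by colltime
                              rows between unbounded preceding and unbounded following) as last_success,
    first_value(conmiss) over (partition by resourcename, colldate order by colltime) as first_conmiss,
    last_value(conmiss) over (partition by resourcename, colldate order by colltime
                              rows between unbounded preceding and unbounded following) as last_conmiss,
    colldate, resourcename
  -- "table" stands in for the actual table name
  from (select s.*, to_date(s.colltime) as colldate from table s) s
) s
group by colldate, resourcename;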
See the documentation here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
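If the columns are pure counters that never decrease within a day, the last value is also the daily maximum and the first value the daily minimum, so a plain aggregate may be enough. The following is only a sketch under that assumption: `metrics` is a hypothetical table name, and the query gives a different result from last minus first for a gauge or for a counter that resets during the day.

```sql
-- Assumes success and conmiss are non-decreasing within each day.
-- "metrics" is a hypothetical table name; replace it with the real one.
select to_date(colltime) as colldate,
       resourcename,
       max(success) - min(success) as success,
       max(conmiss) - min(conmiss) as conmiss
from metrics
group by to_date(colltime), resourcename;
```

This avoids the window functions and the subquery, at the cost of being valid only for monotonic counters.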