I have a table with the following data:
dt device id count
2018-10-05 computer 7541185957382 6
2018-10-20 computer 7541185957382 3
2018-10-14 computer 7553187775734 6
2018-10-17 computer 7553187775734 10
2018-10-21 computer 7553187775734 2
2018-10-22 computer 7549187067178 5
2018-10-20 computer 7553187757256 3
2018-10-11 computer 7549187067178 10
I want to get the last dt for each id. So I tried the first_value and last_value window functions, for example:
select id,last_value(dt) over (partition by id order by dt) last_dt
from table
order by id
;
But I get this error:
FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
Underlying error: Primitve type DATE not supported in Value Boundary expression
I can't diagnose the problem and would appreciate any help.
Answer 0 (score: 1)
The query works if you add a rows between clause to the window definition.
hive> select id,last_value(dt) over (partition by id order by dt
rows between unbounded preceding and unbounded following) last_dt
from table order by id;
Result:
+----------------+-------------+--+
| id | last_dt |
+----------------+-------------+--+
| 7541185957382 | 2018-10-20 |
| 7541185957382 | 2018-10-20 |
| 7549187067178 | 2018-10-22 |
| 7549187067178 | 2018-10-22 |
| 7553187757256 | 2018-10-20 |
| 7553187775734 | 2018-10-21 |
| 7553187775734 | 2018-10-21 |
| 7553187775734 | 2018-10-21 |
+----------------+-------------+--+
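As a hedged explanation of why the explicit frame helps: when ORDER BY is given without a frame, Hive's documented default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and a value-based (range) frame over a DATE column appears to be what the "Value Boundary expression" in the error refers to. The sketch below puts the two frames side by side; it assumes the table is named so, as in the later query of this answer.

```sql
-- Default frame written out explicitly (assumption: this is what the
-- original query resolves to, per Hive's documented default).
-- On Hive versions before the fix this is expected to fail for DATE columns.
select id,
       last_value(dt) over (partition by id order by dt
                            range between unbounded preceding and current row) as last_dt
from so;

-- Row-based frame spanning the whole partition: no value comparison on dt
-- is needed, and last_value sees every row of the id, not only rows up to
-- the current one.
select id,
       last_value(dt) over (partition by id order by dt
                            rows between unbounded preceding and unbounded following) as last_dt
from so;
```

Even where the default frame is accepted, it ends at the current row, so last_value would return the current row's dt instead of the latest dt for the id. The full-partition frame is needed for the intended result either way.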
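There is a Jira ticket about support for this primitive type, and it has been fixed in Hive 2.1.0.
Update:
To get distinct records, you can use the ROW_NUMBER window function and keep only the first row of each partition from the result set.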
hive> select id,last_dt from
(select id,last_value(dt) over (partition by id order by dt
rows between unbounded preceding and unbounded following) last_dt,
ROW_NUMBER() over (partition by id order by dt)rn
from so )t
where t.rn=1;
Result:
+----------------+-------------+--+
| id | dt |
+----------------+-------------+--+
| 7541185957382 | 2018-10-20 |
| 7553187757256 | 2018-10-20 |
| 7553187775734 | 2018-10-21 |
| 7549187067178 | 2018-10-22 |
+----------------+-------------+--+
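If only the latest dt per id is needed and no other columns from the matching row, a plain aggregate may be simpler than a window function. This is an alternative that the original answer does not mention, sketched here against the same so table:

```sql
-- One row per id with its most recent dt; no window frame involved,
-- so the DATE range-frame limitation does not come into play.
select id, max(dt) as last_dt
from so
group by id;
```

The ROW_NUMBER approach above remains the better fit when other columns of the latest row (for example device or count) must be returned as well.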