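I have the table below, which receives incremental updates. I need to write a plain Hive query that merges rows with the same key, keeping the most recent non-null value for each column.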
Key | A | B | C | Timestamp
K1 | X | Null | Null | 2015-05-03
K1 | Null | Y | Z | 2015-05-02
K1 | Foo | Bar | Baz | 2015-05-01
Desired result:
Key | A | B | C | Timestamp
K1 | X | Y | Z | 2015-05-03
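Answer 0 (score: 0)
Use the first_value() function to get the most recent non-null value per column. Each column is ordered by two things: whether its value is null, then the timestamp. These two sort keys are concatenated into one string because, as far as I know, last_value (and first_value, used here) works with only one sort key. Sorting that string in descending order puts non-null rows first, newest timestamp on top.
Demo: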
select distinct
key,
first_value(A) over (partition by Key order by concat(case when A is null then '1' else '2' end, '_', Timestamp) desc) A,
first_value(B) over (partition by Key order by concat(case when B is null then '1' else '2' end, '_', Timestamp) desc) B,
first_value(C) over (partition by Key order by concat(case when C is null then '1' else '2' end, '_', Timestamp) desc) C,
max(timestamp) over(partition by key) timestamp
from
( ---------Replace this subquery with your table
select 'K1' key, 'X' a, Null b, Null c, '2015-05-03' timestamp union all
select 'K1' key, null a, 'Y' b, 'Z' c, '2015-05-02' timestamp union all
select 'K1' key, 'Foo' a, 'Bar' b, 'Baz' c, '2015-05-01' timestamp
)s
;
Output:
OK
key a b c timestamp
K1 X Y Z 2015-05-03
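To see why the concatenated sort key picks the right row, here is a minimal Python sketch that emulates the same logic outside Hive. It is only an illustration: the names `ROWS`, `sort_key`, and `merge_latest` are made up for this example, `None` stands in for SQL NULL, and it assumes timestamps are `yyyy-MM-dd` strings, so that string comparison matches chronological order.

```python
# Rows mirror the demo table: (key, A, B, C, timestamp); None stands for SQL NULL.
ROWS = [
    ("K1", "X", None, None, "2015-05-03"),
    ("K1", None, "Y", "Z", "2015-05-02"),
    ("K1", "Foo", "Bar", "Baz", "2015-05-01"),
]


def sort_key(value, timestamp):
    # Same expression as in the query: null flag first ('1' = null, '2' = not null),
    # then the timestamp, joined into one comparable string.
    return ("1" if value is None else "2") + "_" + timestamp


def merge_latest(rows):
    # Group rows by key (the "partition by Key" part).
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)

    merged = []
    for key, group in groups.items():
        values = []
        for col in (1, 2, 3):
            # first_value(...) with a descending order is the row with the largest sort key.
            best = max(group, key=lambda r: sort_key(r[col], r[4]))
            values.append(best[col])
        # max(timestamp) over (partition by key)
        latest = max(r[4] for r in group)
        merged.append((key, *values, latest))
    return merged


print(merge_latest(ROWS))
```

For column B, for example, the keys are `1_2015-05-03`, `2_2015-05-02`, and `2_2015-05-01`; the largest is `2_2015-05-02`, so `Y` wins even though a newer row exists with a null B. If a column is null in every row of a key, the result for that column stays null.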