Hive query to get the last record inserted into a table

Date: 2017-04-14 19:01:11

Tags: hive hiveql

 a                date             time         b
35573407        20170412        140930  310260453908912
35573407        20170412        140930  310260453908912
35573407        20170412        141054  310260453908912
35573407        20170412        025339  310260453908912
35573407        20170412        072918  310260453908912
35573407        20170412        091105  310260453908912
35573422        20170412        193605  310260453908912
35573407        20170412        121105  310260453908912
35573407        20170412        032439  310260453908912
35573407        20170412        032605  310260453908912

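I am trying to write a Hive query that returns the last record inserted into the table for a given b. The records need to be ordered by the time column and the last one taken. For the records above,

35573422  20170412  193605  310260453908912

is the last record.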

2 answers:

Answer 0 (score: 2)

select  a,date,time,b

from   (select  *
               ,row_number() over 
                (
                    partition by    b
                    order by        date desc
                                   ,time desc
                ) as rn

        from    mytable
        ) t

where   t.rn = 1
+----------+----------+--------+-----------------+
|   a      |   date   |  time  |      b          |
+----------+----------+--------+-----------------+
| 35573422 | 20170412 | 193605 | 310260453908912 |
+----------+----------+--------+-----------------+
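
The logic of this query can be checked locally without a Hive cluster. The sketch below loads the sample rows from the question into an in-memory SQLite database and runs the same row_number() query. SQLite is only a stand-in here, not Hive: it is assumed to be version 3.25 or later (needed for window functions), and the columns are stored as text so that leading zeros in time are preserved. The helper name latest_per_b is made up for this illustration.

```python
import sqlite3

# Sample rows from the question: (a, date, time, b), kept as text.
ROWS = [
    ("35573407", "20170412", "140930", "310260453908912"),
    ("35573407", "20170412", "140930", "310260453908912"),
    ("35573407", "20170412", "141054", "310260453908912"),
    ("35573407", "20170412", "025339", "310260453908912"),
    ("35573407", "20170412", "072918", "310260453908912"),
    ("35573407", "20170412", "091105", "310260453908912"),
    ("35573422", "20170412", "193605", "310260453908912"),
    ("35573407", "20170412", "121105", "310260453908912"),
    ("35573407", "20170412", "032439", "310260453908912"),
    ("35573407", "20170412", "032605", "310260453908912"),
]


def latest_per_b(rows):
    """Return the newest row for each b, using the row_number() approach."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (a text, date text, time text, b text)")
    conn.executemany("insert into mytable values (?, ?, ?, ?)", rows)
    query = """
        select a, date, time, b
        from (select *,
                     row_number() over (partition by b
                                        order by date desc, time desc) as rn
              from mytable) t
        where t.rn = 1
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


latest = latest_per_b(ROWS)
print(latest)
# [('35573422', '20170412', '193605', '310260453908912')]
```

Because date and time are fixed-width digit strings, sorting them as text gives the same order as sorting them chronologically.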

Answer 1 (score: 0)

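Use the following instead: it should be faster, and it can also be pushed down into a subquery in case we use a WITH clause. Here group_col stands for the grouping column (b in the question).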

select  t.group_col, t.struct_val.col1 as date, t.struct_val.col2 as time
from   (select  group_col, max(struct(date,time)) as struct_val
        from    mytable
        group by group_col) t
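
The idea behind max(struct(date,time)) is that the struct is compared field by field, so the maximum is the latest date and, within that date, the latest time. As far as I understand Hive's struct ordering, this matches how Python compares tuples, so it can be sketched as follows. The function name latest_per_group and the second b value are hypothetical and exist only for this illustration.

```python
def latest_per_group(rows):
    """Return {group_key: (date, time)} with the largest (date, time) pair per key."""
    latest = {}
    for group_key, date, time in rows:
        candidate = (date, time)  # plays the role of struct(date, time)
        # Tuples compare element by element: date first, then time.
        if group_key not in latest or candidate > latest[group_key]:
            latest[group_key] = candidate
    return latest


rows = [
    ("310260453908912", "20170412", "140930"),
    ("310260453908912", "20170412", "193605"),
    ("310260453908912", "20170412", "025339"),
    ("310260453908999", "20170411", "235959"),  # hypothetical second b value
    ("310260453908999", "20170412", "000001"),  # hypothetical second b value
]

result = latest_per_group(rows)
print(result)
# {'310260453908912': ('20170412', '193605'), '310260453908999': ('20170412', '000001')}
```

Like the query above, this returns only the grouping key and the (date, time) pair. To carry other columns such as a along, they would have to be added as further fields after date and time.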