Hive SQL: rank over ID

Time: 2017-12-17 22:09:16

Tags: sql hive rank

I have the data below. I want to aggregate it and put a row_number on the aggregated rows.

| ID_1 | time | ID_2 |
| a    | 1    | 36   |
| a    | 2    | 36   |
| a    | 3    | 45   |
| a    | 4    | 65   |
| b    | 1    | 75   |
| b    | 2    | 35   |
| b    | 3    | 35   |
| b    | 4    | 76   |

The desired output looks like this.

| ID_1 | ID_2 | Row_number |
| a    | 36   | 1          |
| a    | 45   | 2          |
| a    | 65   | 3          |
| b    | 75   | 1          |
| b    | 35   | 2          |
| b    | 76   | 3          |

My attempt was this query:

select
ID_1, ID_2,
row_number() over (partition by ID_1, ID_2 order by time desc) as Row_number
from table1

But it yields:

| ID_1 | ID_2 | Row_number |
| a    | 36   | 1          |
| a    | 36   | 2          |
| a    | 45   | 1          |
| a    | 65   | 1          |
| b    | 75   | 1          |
| b    | 35   | 1          |
| b    | 35   | 2          |
| b    | 76   | 1          |

If I add a GROUP BY at the end, I get an error about the time column.

1 Answer:

Answer 0: (score: 2)

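You need to group by ID_1 and ID_2 first, and then apply row_number() to the grouped result, partitioned by ID_1 only and ordered by an aggregated time.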

SELECT id_1,
       id_2,
       row_number()
         OVER (
           partition BY id_1
           ORDER BY time ) AS Row_number
FROM   (SELECT id_1,
               id_2,
               MAX(time) time
        FROM   table1
        GROUP  BY id_1,
                  id_2) b;  
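
To try the idea without an existing table, here is a minimal self-contained sketch. It is a variation on the query above, not a replacement for it. It assumes a Hive version that supports CTEs and SELECT without FROM (an engine such as PostgreSQL or SQLite should accept it as well). The names grouped, first_time and rn are made up for illustration, and it orders by MIN(time) instead of MAX(time), so each (id_1, id_2) pair is ranked by when it first appears. On the sample data both choices give the same ranking.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH table1 AS (
    SELECT 'a' AS id_1, 1 AS time, 36 AS id_2 UNION ALL
    SELECT 'a', 2, 36 UNION ALL
    SELECT 'a', 3, 45 UNION ALL
    SELECT 'a', 4, 65 UNION ALL
    SELECT 'b', 1, 75 UNION ALL
    SELECT 'b', 2, 35 UNION ALL
    SELECT 'b', 3, 35 UNION ALL
    SELECT 'b', 4, 76
),
-- Step 1: collapse duplicates, keeping one time value per (id_1, id_2).
grouped AS (
    SELECT id_1,
           id_2,
           MIN(time) AS first_time   -- hypothetical alias; earliest occurrence
    FROM   table1
    GROUP  BY id_1, id_2
)
-- Step 2: number the collapsed rows within each id_1.
SELECT id_1,
       id_2,
       row_number() OVER (PARTITION BY id_1 ORDER BY first_time) AS rn
FROM   grouped
ORDER  BY id_1, rn;
```

The original attempt partitioned by both ID_1 and ID_2, so numbering restarted for every distinct pair. Adding GROUP BY to that query fails because time is neither grouped nor aggregated, yet the window's ORDER BY still refers to it. Aggregating time in a subquery or CTE first avoids both problems.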
