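Get the top 3 values for each ID

Time: 2018-07-18 08:50:40

Tags: sql database hiveql

I am trying to find out whether it is possible in HiveQL to pull the top 3 values for each ID, ranked by count. The input and desired output look like this:

Do I need some kind of inner join for this? Any hints would be appreciated.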

Input
[id]        [word]  [count]
B000JMLBHU  book    89
B000JMLBHU  read    83
B000JMLBHU  was     76
B000JMLBHU  story   54
B000R93D4Y  with    69
B000R93D4Y  book    61
B000R93D4Y  story   60
B000R93D4Y  was     57
B000R93D4Y  have    53
B001892DGG  was     68
B001892DWA  was     73
B001BXNQ2O  was     119
B001BXNQ2O  book    59
B001H55R8M  was     56
B001HQHCBQ  was     93
B001HQHCBQ  story   75
B001HQHCBQ  bella   61
B001HQHCBQ  with    59
B001HQHCBQ  love    58
B001HQHCBQ  zsadist 53


Output
[id]        [word]  [count]
B000JMLBHU  book    89
B000JMLBHU  read    83
B000JMLBHU  was     76
B000R93D4Y  with    69
B000R93D4Y  book    61
B000R93D4Y  story   60
B001892DGG  was     68
B001892DWA  was     73
B001BXNQ2O  was     119
B001BXNQ2O  book    59
B001H55R8M  was     56
B001HQHCBQ  was     93
B001HQHCBQ  story   75
B001HQHCBQ  bella   61

5 Answers:

Answer 0 (score: 2)

You can use the row_number() function:

-- your_table is a placeholder; TABLE itself is a reserved word
select t.*
from (select *, row_number() over (partition by id order by count desc) as seq
      from your_table
     ) t
where seq <= 3;
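
As a quick check without a Hive cluster, the sketch below runs the same ROW_NUMBER() pattern against the question's sample rows using Python's built-in sqlite3 module. SQLite (3.25 or newer, needed for window functions) is only a stand-in for Hive here, and the table name word_counts and the helper top3_per_id are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, word, count).
ROWS = [
    ("B000JMLBHU", "book", 89), ("B000JMLBHU", "read", 83),
    ("B000JMLBHU", "was", 76), ("B000JMLBHU", "story", 54),
    ("B000R93D4Y", "with", 69), ("B000R93D4Y", "book", 61),
    ("B000R93D4Y", "story", 60), ("B000R93D4Y", "was", 57),
    ("B000R93D4Y", "have", 53),
    ("B001892DGG", "was", 68), ("B001892DWA", "was", 73),
    ("B001BXNQ2O", "was", 119), ("B001BXNQ2O", "book", 59),
    ("B001H55R8M", "was", 56),
    ("B001HQHCBQ", "was", 93), ("B001HQHCBQ", "story", 75),
    ("B001HQHCBQ", "bella", 61), ("B001HQHCBQ", "with", 59),
    ("B001HQHCBQ", "love", 58), ("B001HQHCBQ", "zsadist", 53),
]

# Number the rows inside each id by descending count, then keep the first 3.
TOP3_SQL = """
SELECT id, word, count
FROM (SELECT id, word, count,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY count DESC) AS seq
      FROM word_counts) t
WHERE seq <= 3
ORDER BY id, count DESC
"""


def top3_per_id(rows):
    """Load rows into an in-memory SQLite table and return the top 3 per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word_counts (id TEXT, word TEXT, count INTEGER)")
    conn.executemany("INSERT INTO word_counts VALUES (?, ?, ?)", rows)
    result = conn.execute(TOP3_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top3_per_id(ROWS):
        print(*row)
```

On the sample data this returns the 14 rows listed under Output in the question; IDs with fewer than three words simply keep all of their rows.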

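Answer 1 (score: 1)

For clarity: this answer is specific to HiveQL, and it will also work on MySQL 8+.

You can use a common table expression and the window function rank to get the top 3 results for each ID: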

WITH cte AS (
    SELECT *,
    RANK() OVER (PARTITION BY id ORDER BY count DESC) rnk
    FROM your_table
)

SELECT *
FROM cte
WHERE rnk <= 3
ORDER BY id, rnk;  -- sort the final result; an ORDER BY inside the CTE does not guarantee output order
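
RANK() and ROW_NUMBER() behave the same on the question's sample data, because no ID has tied counts. They differ once there is a tie at the cutoff. The sketch below illustrates this on hypothetical rows (not from the question) using Python's sqlite3 module as a stand-in for Hive; the table name word_counts and the helper rows_kept are assumptions made for the example.

```python
import sqlite3

# Hypothetical rows (not from the question): w3 and w4 tie for third place.
ROWS = [("A", "w1", 10), ("A", "w2", 9), ("A", "w3", 8),
        ("A", "w4", 8), ("A", "w5", 5)]


def rows_kept(window_fn, rows=ROWS):
    """Return the words that pass `<= 3` when numbered with window_fn."""
    # Only these two function names are interpolated into the SQL text.
    if window_fn not in ("RANK", "ROW_NUMBER"):
        raise ValueError("window_fn must be 'RANK' or 'ROW_NUMBER'")
    sql = f"""
        SELECT word
        FROM (SELECT word,
                     {window_fn}() OVER (PARTITION BY id ORDER BY count DESC) AS rnk
              FROM word_counts) t
        WHERE rnk <= 3
        ORDER BY word
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word_counts (id TEXT, word TEXT, count INTEGER)")
    conn.executemany("INSERT INTO word_counts VALUES (?, ?, ?)", rows)
    words = [row[0] for row in conn.execute(sql)]
    conn.close()
    return words


if __name__ == "__main__":
    print("RANK keeps:      ", rows_kept("RANK"))
    print("ROW_NUMBER keeps:", rows_kept("ROW_NUMBER"))
```

With RANK() both tied rows get rank 3, so four rows pass the filter. With ROW_NUMBER() exactly three rows pass, and which of the tied rows is kept is not defined unless the ORDER BY adds a tie-breaker column.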

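Answer 2 (score: 0)

You can try using the ROW_NUMBER function and filtering on it with a condition such as ROW_NUMBER <= 3. Compute the row number in a subquery and apply the WHERE clause in the outer query, because a window function cannot be used directly in WHERE.

Answer 3 (score: 0)

You may be looking for the LIMIT clause: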

  SELECT id
       , word
       , count
    FROM whatever
ORDER BY count DESC
   LIMIT 3
       ;

See this section of the MySQL reference.

Answer 4 (score: 0)

You can use ORDER BY on the count column with DESC, together with LIMIT 3; that is how you get the 3 highest values.
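
To see what ORDER BY count DESC LIMIT 3 returns, the sketch below runs it on a subset of the question's rows using Python's sqlite3 module as a stand-in for Hive. The table name word_counts and the helper global_top3 are made up for the example.

```python
import sqlite3

# A subset of the question's sample rows: (id, word, count).
ROWS = [
    ("B000JMLBHU", "book", 89), ("B000JMLBHU", "read", 83),
    ("B000JMLBHU", "was", 76), ("B000JMLBHU", "story", 54),
    ("B001BXNQ2O", "was", 119), ("B001BXNQ2O", "book", 59),
    ("B001HQHCBQ", "was", 93), ("B001HQHCBQ", "story", 75),
    ("B001HQHCBQ", "bella", 61),
]


def global_top3(rows=ROWS):
    """Return the result of ORDER BY count DESC LIMIT 3 over all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word_counts (id TEXT, word TEXT, count INTEGER)")
    conn.executemany("INSERT INTO word_counts VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT id, word, count FROM word_counts ORDER BY count DESC LIMIT 3"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in global_top3():
        print(*row)
```

The query returns three rows in total, one from each of three different IDs, because LIMIT applies to the whole result set. Getting three rows per ID, as in the question's expected output, needs a per-ID numbering such as ROW_NUMBER() OVER (PARTITION BY id ...).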