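I'm trying to find out whether HiveQL can pull the top 3 values for each ID, ranked by count. The input and the expected output look like this:
Do I need some kind of inner join for this? Any pointers would be appreciated.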
Input
[id] [word] [count]
B000JMLBHU book 89
B000JMLBHU read 83
B000JMLBHU was 76
B000JMLBHU story 54
B000R93D4Y with 69
B000R93D4Y book 61
B000R93D4Y story 60
B000R93D4Y was 57
B000R93D4Y have 53
B001892DGG was 68
B001892DWA was 73
B001BXNQ2O was 119
B001BXNQ2O book 59
B001H55R8M was 56
B001HQHCBQ was 93
B001HQHCBQ story 75
B001HQHCBQ bella 61
B001HQHCBQ with 59
B001HQHCBQ love 58
B001HQHCBQ zsadist 53
Output
[id] [word] [count]
B000JMLBHU book 89
B000JMLBHU read 83
B000JMLBHU was 76
B000R93D4Y with 69
B000R93D4Y book 61
B000R93D4Y story 60
B001892DGG was 68
B001892DWA was 73
B001BXNQ2O was 119
B001BXNQ2O book 59
B001H55R8M was 56
B001HQHCBQ was 93
B001HQHCBQ story 75
B001HQHCBQ bella 61
Answer 0 (score: 2)
You can use the row_number() function:
select t.*
from (select *, row_number() over (partition by id order by count desc) as seq
      from your_table  -- placeholder name; "table" itself is a reserved word
     ) t
where seq <= 3;
Answer 1 (score: 1)
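For clarity: this answer targets HiveQL, and the same query also works in MySQL 8+.
You can use a common table expression and the window function RANK to get the top 3 results for each ID: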
WITH cte AS (
    SELECT *,
           RANK() OVER (PARTITION BY id ORDER BY count DESC) AS rnk
    FROM your_table
)
SELECT *
FROM cte
WHERE rnk <= 3
ORDER BY id, rnk;
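One thing worth checking with RANK is how ties are handled, since tied counts share a rank and a filter on rnk <= 3 can then return more than three rows for an ID. The sketch below puts the three ranking functions side by side so the difference is visible. The table name word_counts is an assumption made for illustration, and the backticks around count are only a precaution against clashing with the function name.

```sql
-- Assumption: word_counts(id, word, `count`) is a hypothetical table
-- shaped like the input in the question.
SELECT id,
       word,
       `count`,
       -- always 1, 2, 3, ... within an id; ties are broken arbitrarily
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY `count` DESC) AS row_num,
       -- ties share a rank and the following rank is skipped (1, 2, 2, 4)
       RANK()       OVER (PARTITION BY id ORDER BY `count` DESC) AS rnk,
       -- ties share a rank and no rank is skipped (1, 2, 2, 3)
       DENSE_RANK() OVER (PARTITION BY id ORDER BY `count` DESC) AS dense_rnk
FROM word_counts;
```

If exactly three rows per ID are required, ROW_NUMBER is the safer choice. If tied words should all be kept, RANK or DENSE_RANK fits better.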
Answer 2 (score: 0)
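You can try the ROW_NUMBER function and filter on its result in the WHERE clause, e.g. ROW_NUMBER <= 3. The row number has to be computed in a subquery first, because a window function cannot be referenced directly in WHERE.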
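A minimal sketch of that idea, assuming a hypothetical table word_counts with the columns from the question. The row number is computed in a subquery and the outer query filters on it. The extra sort key word is an assumption added only to make the result deterministic when two words have the same count.

```sql
-- Assumption: word_counts(id, word, `count`) is a hypothetical table name.
SELECT id, word, `count`
FROM (
    SELECT id, word, `count`,
           -- number the rows of each id from highest to lowest count;
           -- word breaks ties so repeated runs return the same rows
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY `count` DESC, word) AS rn
    FROM word_counts
) ranked
WHERE rn <= 3;  -- keep at most three rows per id
```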
Answer 3 (score: 0)
You are probably looking for the LIMIT clause:
SELECT id
, word
, count
FROM whatever
ORDER BY count DESC
LIMIT 3
;
See the corresponding section of the MySQL reference manual.
Answer 4 (score: 0)
You can use ORDER BY on the count column in DESC order together with LIMIT 3; that gives you the 3 highest values.
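As a rough sketch of what this looks like, assuming a hypothetical table word_counts with the columns from the question: LIMIT applies to the whole result set, so the first query returns the three highest counts overall rather than three per ID. The second query restricts the rows to one ID first, which gives the top 3 for that ID only.

```sql
-- Assumption: word_counts(id, word, `count`) is a hypothetical table name.

-- Three highest counts across all ids. With the input from the question
-- these are: B001BXNQ2O was 119, B001HQHCBQ was 93, B000JMLBHU book 89.
SELECT id, word, `count`
FROM word_counts
ORDER BY `count` DESC
LIMIT 3;

-- Three highest counts for a single id. With the input from the question
-- these are: book 89, read 83, was 76.
SELECT id, word, `count`
FROM word_counts
WHERE id = 'B000JMLBHU'
ORDER BY `count` DESC
LIMIT 3;
```

Getting the top 3 for every ID in one query still needs a window function, since LIMIT has no per-group form.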