Given the following MySQL table:
id | colA | colB
...| 1 | 13
...| 1 | 13
...| 1 | 12
...| 1 | 12
...| 1 | 11
...| 2 | 78
...| 2 | 78
...| 2 | 78
...| 2 | 13
...| 2 | 13
...| 2 | 9
For each value in colA, I want to find the N most frequent values in colB.
Example result for N = 2:
colA | colB
1 | 13
1 | 12
2 | 78
2 | 13
I was able to get all unique combinations of colA and colB, together with their frequencies, using:
SELECT colA, colB, COUNT(*) AS freq FROM t GROUP BY colA, colB ORDER BY freq DESC;
Example result:
colA | colB | freq
1 | 13 | 2
1 | 12 | 2
1 | 11 | 1
2 | 78 | 3
2 | 13 | 2
2 | 9 | 1
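But I am struggling to apply the LIMIT to each value of colA rather than to the whole table.
This is basically the same as "How to select most frequent value in a column per each id group?", only for MySQL instead of PostgreSQL.
I am currently using MariaDB 10.1.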
Answer 0 (score: 1)
Use window functions, if you can:
SELECT colA, colB, freq
FROM (SELECT colA, colB, COUNT(*) AS freq,
             DENSE_RANK() OVER (PARTITION BY colA ORDER BY COUNT(*) DESC) AS seqnum
      FROM t
      GROUP BY colA, colB
     ) ab
WHERE seqnum <= 2;
Note that, depending on how you want to treat ties, you might want DENSE_RANK(), RANK(), or ROW_NUMBER(). If there are 5 colB values that share the top two ranks, then DENSE_RANK() will return all five.
If you want exactly two values, use ROW_NUMBER().
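To make the difference between the three ranking functions concrete, here is a small sketch that computes all of them side by side on the question's sample data. The CREATE TABLE / INSERT setup and the `colB DESC` tie-breaker for ROW_NUMBER() are my own assumptions, not something the question specifies, and the query needs a server that supports window functions.

```sql
-- Assumed setup reproducing the sample data from the question.
CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, colA INT, colB INT);
INSERT INTO t (colA, colB) VALUES
  (1, 13), (1, 13), (1, 12), (1, 12), (1, 11),
  (2, 78), (2, 78), (2, 78), (2, 13), (2, 13), (2, 9);

-- Compare the three ranking functions per colA group.
SELECT colA, colB, COUNT(*) AS freq,
       -- tie-breaker on colB is an arbitrary choice to make the numbering deterministic
       ROW_NUMBER() OVER (PARTITION BY colA ORDER BY COUNT(*) DESC, colB DESC) AS rn,
       RANK()       OVER (PARTITION BY colA ORDER BY COUNT(*) DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY colA ORDER BY COUNT(*) DESC) AS drnk
FROM t
GROUP BY colA, colB
ORDER BY colA, freq DESC, colB DESC;

-- colA | colB | freq | rn | rnk | drnk
--    1 |   13 |    2 |  1 |   1 |    1
--    1 |   12 |    2 |  2 |   1 |    1
--    1 |   11 |    1 |  3 |   3 |    2
--    2 |   78 |    3 |  1 |   1 |    1
--    2 |   13 |    2 |  2 |   2 |    2
--    2 |    9 |    1 |  3 |   3 |    3
```

Note that for colA = 1 a filter of `drnk <= 2` would keep all three rows (13, 12 and 11), because 13 and 12 tie for rank 1. Filtering on `rn <= 2` gives exactly the two rows per group shown in the question's expected result.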
Answer 1 (score: 0)
You can use a couple of CTEs for this, e.g.:
WITH counts AS (
    -- frequency of every (colA, colB) pair
    SELECT colA, colB, COUNT(*) AS freq FROM t GROUP BY colA, colB
), most_freq AS (
    -- highest frequency per colA; the alias is required by the join below
    SELECT colA, MAX(freq) AS freq FROM counts GROUP BY colA
)
SELECT counts.*
FROM counts
JOIN most_freq ON (counts.colA = most_freq.colA
    AND counts.freq = most_freq.freq);
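A note for older servers: the question mentions MariaDB 10.1, and as far as I know both window functions and CTEs only arrived in the MariaDB 10.2 series (and in MySQL 8.0). The query above also only keeps the single highest frequency per colA rather than the top N. The following is a rough sketch, not a tuned solution, that uses only derived tables and a correlated subquery to get the top N = 2 per group. Breaking ties by the larger colB is an arbitrary assumption on my part.

```sql
-- Top 2 most frequent colB values per colA without window functions or CTEs.
SELECT c.colA, c.colB, c.freq
FROM (SELECT colA, colB, COUNT(*) AS freq
      FROM t
      GROUP BY colA, colB) AS c
WHERE (
        -- count the pairs in the same colA group that sort ahead of this one
        SELECT COUNT(*)
        FROM (SELECT colA, colB, COUNT(*) AS freq
              FROM t
              GROUP BY colA, colB) AS c2
        WHERE c2.colA = c.colA
          AND (c2.freq > c.freq
               OR (c2.freq = c.freq AND c2.colB > c.colB))  -- assumed tie-breaker
      ) < 2                                                 -- N = 2
ORDER BY c.colA, c.freq DESC, c.colB DESC;
```

On the sample data from the question this should return (1, 13), (1, 12), (2, 78) and (2, 13). The grouping subquery is evaluated again for the inner comparison, so this is likely to be slow on large tables; on a server that supports it, the ROW_NUMBER() approach is the simpler choice.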