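Select the most frequently occurring value in MySQL

Date: 2010-07-15 15:23:41

Tags: mysql group-by greatest-n-per-group

I'm looking for a way to select the most frequently occurring value, for example the person who has posted the most in each thread: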

SELECT MOST_OCCURRING(user_id) FROM thread_posts GROUP BY thread_id

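Is there a good way to do this?

3 answers:

Answer 0 (score: 9)

If you want the count on a per-thread basis, I think you can use a nested query: the outer query groups by thread, and the correlated subquery groups that thread's posts by user and keeps the user with the highest count: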

SELECT thread_id AS tid,
    (SELECT user_id FROM thread_posts 
        WHERE thread_id = tid 
        GROUP BY user_id
        ORDER BY COUNT(*) DESC
        LIMIT 0,1) AS topUser
FROM thread_posts
GROUP BY thread_id
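
As a rough way to try the nested-query idea without a MySQL server, the sketch below runs a similar correlated subquery against an in-memory SQLite database from Python. SQLite is only a stand-in here, and the function name `top_user_per_thread`, the table aliases `o` and `i`, and the sample rows are all assumptions made for illustration. The extra `user_id` sort key is also an assumption: it makes the result deterministic when two users tie.

```python
import sqlite3


def top_user_per_thread(rows):
    """Return (thread_id, top_user) pairs for the given (thread_id, user_id) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thread_posts (thread_id INTEGER, user_id INTEGER)")
    conn.executemany("INSERT INTO thread_posts VALUES (?, ?)", rows)
    query = """
        SELECT o.thread_id,
               (SELECT i.user_id
                  FROM thread_posts AS i
                 WHERE i.thread_id = o.thread_id
                 GROUP BY i.user_id
                 -- hypothetical tie-break: lowest user_id wins on equal counts
                 ORDER BY COUNT(*) DESC, i.user_id ASC
                 LIMIT 1) AS top_user
          FROM thread_posts AS o
         GROUP BY o.thread_id
         ORDER BY o.thread_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data: user 10 leads thread 1, user 40 leads thread 2.
sample = [(1, 10), (1, 10), (1, 20), (2, 30), (2, 40), (2, 40)]
top_users = top_user_per_thread(sample)
print(top_users)
```

The subquery runs once per thread, so on a large table an index covering (thread_id, user_id) would probably help.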

Answer 1 (score: 3)

This lists the number of posts by each user_id in each thread:

SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id

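But you only want the top user for each thread. Keep only the rows whose count equals the highest per-user count in the same thread (in MySQL the derived table needs an alias). If several users tie for the top count, all of them are returned:

SELECT c.thread_id, c.user_id, c.postings
FROM (
  SELECT thread_id, user_id, COUNT(*) AS postings
  FROM thread_posts
  GROUP BY thread_id, user_id
) AS c
WHERE c.postings = (
  -- highest per-user post count within the same thread
  SELECT COUNT(*) FROM thread_posts t
  WHERE t.thread_id = c.thread_id
  GROUP BY t.user_id
  ORDER BY COUNT(*) DESC
  LIMIT 1
)

Note that this cannot be shortened to the following form. Here postings is already an aggregate, so MAX(postings) would nest one aggregate inside another, which MySQL typically rejects as an invalid use of a group function:

SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id
HAVING postings = max(postings)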

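The HAVING keyword is typically used together with aggregate operations to pick out the aggregated output rows that satisfy the HAVING condition.

HAVING differs from WHERE: WHERE filters the query's input rows before they are grouped, whereas HAVING filters the grouped result. Because HAVING works on the grouped output, it must appear after the GROUP BY clause, and before ORDER BY if there is one.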
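
The difference between filtering input rows and filtering grouped output can be seen in a small experiment. The sketch below is a minimal illustration, assuming an in-memory SQLite database as a stand-in for MySQL and made-up sample rows. In both queries the clauses run in the order WHERE, GROUP BY, HAVING, ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thread_posts (thread_id INTEGER, user_id INTEGER)")
# Made-up sample data for illustration only.
conn.executemany(
    "INSERT INTO thread_posts VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 20), (2, 30), (2, 40), (2, 40)],
)

# WHERE runs before grouping: only rows from thread 1 are ever counted.
where_rows = conn.execute(
    """
    SELECT thread_id, user_id, COUNT(*) AS postings
      FROM thread_posts
     WHERE thread_id = 1
     GROUP BY thread_id, user_id
     ORDER BY thread_id, user_id
    """
).fetchall()

# HAVING runs after grouping: every row is counted, then groups
# with fewer than 2 posts are dropped from the output.
having_rows = conn.execute(
    """
    SELECT thread_id, user_id, COUNT(*) AS postings
      FROM thread_posts
     GROUP BY thread_id, user_id
    HAVING COUNT(*) >= 2
     ORDER BY thread_id, user_id
    """
).fetchall()
conn.close()

print(where_rows)
print(having_rows)
```

A condition on an aggregate such as COUNT(*) can only go in HAVING, since the counts do not exist yet when WHERE is evaluated.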

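Answer 2 (score: 2)

If you look at the questions under the "greatest-n-per-group" tag, there are plenty of examples. But in this case you haven't defined how to handle ties: what should happen if two or more users have the same count?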

-- The alias is rnk rather than rank because RANK is a reserved word in MySQL 8.0+
SELECT DISTINCT
       tp.thread_id,
       tp.user_id
  FROM THREAD_POSTS tp
  JOIN (SELECT t.thread_id,
               t.user_id,
               COUNT(t.user_id) AS occurrence,
               CASE
                 WHEN @thread != t.thread_id THEN @rownum := 1
                 ELSE @rownum := @rownum + 1
               END AS rnk,
               @thread := t.thread_id
          FROM THREAD_POSTS t
          JOIN (SELECT @rownum := 0, @thread := -1) r
      GROUP BY t.thread_id, t.user_id
      ORDER BY t.thread_id, occurrence DESC) x ON x.thread_id = tp.thread_id
                                              AND x.user_id = tp.user_id
                                              AND x.rnk = 1
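
One possible answer to the tie question is to return every user who shares the top count in a thread. The sketch below is not the query above. It swaps in a plain join against the per-thread maximum, with no user variables, and runs it on an in-memory SQLite database as a stand-in for MySQL. The function name `top_users_with_ties`, the aliases `c`, `per_user` and `m`, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def top_users_with_ties(rows):
    """Return (thread_id, user_id, postings) for every user tied for the top count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thread_posts (thread_id INTEGER, user_id INTEGER)")
    conn.executemany("INSERT INTO thread_posts VALUES (?, ?)", rows)
    query = """
        SELECT c.thread_id, c.user_id, c.postings
          FROM (SELECT thread_id, user_id, COUNT(*) AS postings
                  FROM thread_posts
                 GROUP BY thread_id, user_id) AS c
          JOIN (SELECT thread_id, MAX(postings) AS max_postings
                  FROM (SELECT thread_id, user_id, COUNT(*) AS postings
                          FROM thread_posts
                         GROUP BY thread_id, user_id) AS per_user
                 GROUP BY thread_id) AS m
            ON m.thread_id = c.thread_id
           AND m.max_postings = c.postings
         ORDER BY c.thread_id, c.user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data: in thread 3, users 50 and 60 tie with one post each.
sample = [(1, 10), (1, 10), (1, 20), (2, 30), (2, 40), (2, 40), (3, 50), (3, 60)]
tied = top_users_with_ties(sample)
print(tied)
```

If exactly one user per thread is wanted instead, a deterministic tie-break (for example the lowest user_id) has to be chosen explicitly; the data alone does not decide it.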