I want to count users, grouped by the number of comments they have made.
[User]: ID
[Comment]: ID, UserID
So if user A has made 1 comment, user B has made 1 comment and user C has made 2 comments, the output would be:
0 comments => 0 users
1 comment => 2 users (A+B)
2 comments => 1 user (C)
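How would you query this?
Answer 0 (score: 3)
This depends on your particular database structure, but let's assume you have a users table and a comments table: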
users table:
id: serial
name: text
comments table:
id: serial
user_id: integer (foreign key to the users table)
comment: text
You can count the number of comments each user has made with this query:
SELECT users.id, users.name, count(comments.id) as comment_cnt
FROM users LEFT JOIN
comments ON users.id = comments.user_id
GROUP BY users.id, users.name
You can then use the result of this query in a nested query to count the number of users for each comment count:
SELECT comment_cnt, count(*) FROM
(SELECT users.id, users.name, count(comments.id) as comment_cnt
FROM users LEFT JOIN
comments ON users.id = comments.user_id
GROUP BY users.id, users.name) AS comment_cnts
GROUP BY comment_cnt;
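To see what the nested query returns, here is a minimal sketch that loads the three users from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite and the sample rows are assumptions made only to keep the demo self-contained; the query text is the one above, plus an ORDER BY for stable output.

```python
import sqlite3

# In-memory database with the assumed users/comments layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        comment TEXT
    );
    -- Sample data from the question: A and B have 1 comment each, C has 2.
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO comments (user_id, comment) VALUES
        (1, 'a1'), (2, 'b1'), (3, 'c1'), (3, 'c2');
""")

# Inner query: comments per user. Outer query: users per comment count.
users_per_count = conn.execute("""
    SELECT comment_cnt, count(*) AS user_cnt
    FROM (SELECT users.id, users.name, count(comments.id) AS comment_cnt
          FROM users
          LEFT JOIN comments ON users.id = comments.user_id
          GROUP BY users.id, users.name) AS comment_cnts
    GROUP BY comment_cnt
    ORDER BY comment_cnt
""").fetchall()

for comment_cnt, user_cnt in users_per_count:
    print(f"{comment_cnt} comment(s) => {user_cnt} user(s)")
# 1 comment(s) => 2 user(s)
# 2 comment(s) => 1 user(s)
```

No row comes back for 0 comments, because no user in the sample has that count.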
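I don't know of an elegant way to fill the gaps for comment counts that no user has, but a helper table of whole numbers and another level of nesting works: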
CREATE TABLE wholenumbers (num integer);
INSERT INTO wholenumbers VALUES (0), (1), (2), (3), (4), (5), (6);
SELECT num AS comment_cnt, COALESCE(user_cnt, 0) AS user_cnt
FROM wholenumbers
LEFT JOIN (SELECT comment_cnt, count(*) AS user_cnt
           FROM (SELECT users.id, users.name, count(comments.id) AS comment_cnt
                 FROM users LEFT JOIN comments ON users.id = comments.user_id
                 GROUP BY users.id, users.name) AS comment_cnts
           GROUP BY comment_cnt) AS user_cnts ON wholenumbers.num = user_cnts.comment_cnt
ORDER BY num;
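The gap-filling query can be checked the same way. This sketch assumes SQLite through Python's sqlite3 and the question's three users, and it fills wholenumbers with 0 to 3 only; that range is an arbitrary choice for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        comment TEXT
    );
    CREATE TABLE wholenumbers (num INTEGER);
    -- Sample data from the question: A and B have 1 comment each, C has 2.
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO comments (user_id, comment) VALUES
        (1, 'a1'), (2, 'b1'), (3, 'c1'), (3, 'c2');
""")

# Helper table with the comment counts we want to report on (0..3 here).
conn.executemany("INSERT INTO wholenumbers (num) VALUES (?)",
                 [(n,) for n in range(4)])

filled = conn.execute("""
    SELECT num AS comment_cnt, COALESCE(user_cnt, 0) AS user_cnt
    FROM wholenumbers
    LEFT JOIN (SELECT comment_cnt, count(*) AS user_cnt
               FROM (SELECT users.id, count(comments.id) AS comment_cnt
                     FROM users
                     LEFT JOIN comments ON users.id = comments.user_id
                     GROUP BY users.id) AS comment_cnts
               GROUP BY comment_cnt) AS user_cnts
           ON wholenumbers.num = user_cnts.comment_cnt
    ORDER BY num
""").fetchall()

print(filled)  # [(0, 0), (1, 2), (2, 1), (3, 0)]
```

Because the helper table is filled by hand, it has to reach at least the highest comment count that occurs; any count beyond its last value silently drops out of the result.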
Answer 1 (score: 2)
Building on the table layout @ClaytonC provided:
WITH cte AS (
SELECT msg_ct, count(*) AS users
FROM (
SELECT count(*) AS msg_ct
FROM comments
GROUP BY user_id
) sub
GROUP BY 1
)
SELECT msg_ct, COALESCE(users, 0) AS users
FROM generate_series(0, (SELECT max(msg_ct) FROM cte)) msg_ct
LEFT JOIN cte USING (msg_ct)
ORDER BY 1;
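First, count the comments per user (msg_ct). As long as a foreign key enforces referential integrity, there is no need to join to the users table to aggregate comments per user; just count the rows in comments.
Next, count the number of users per comment count (users).
I do this in a CTE because the derived table is used twice in the final query.
In the final query, generate_series() first produces every count from 0 up to the maximum, which is determined dynamically, including the gaps.
Then LEFT JOIN the CTE to that series to get the final result.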
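The steps above can be tried out without a PostgreSQL server. The sketch below is only a stand-in under stated assumptions: it runs on SQLite through Python's sqlite3, and since generate_series() is not assumed to be available there, a recursive CTE (called series here, a made-up name) plays its role. The sample data is hypothetical, with user C at 3 comments so that a gap appears at 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        comment TEXT
    );
    -- Hypothetical data: A and B have 1 comment each, C has 3.
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO comments (user_id, comment) VALUES
        (1, 'a1'), (2, 'b1'), (3, 'c1'), (3, 'c2'), (3, 'c3');
""")

rows = conn.execute("""
    WITH RECURSIVE
    cte AS (                      -- users per comment count
        SELECT msg_ct, count(*) AS users
        FROM (SELECT count(*) AS msg_ct
              FROM comments
              GROUP BY user_id) AS sub
        GROUP BY msg_ct
    ),
    series(n) AS (                -- stand-in for generate_series(0, max)
        SELECT 0
        UNION ALL
        SELECT n + 1 FROM series
        WHERE n < (SELECT max(msg_ct) FROM cte)
    )
    SELECT series.n AS msg_ct, COALESCE(cte.users, 0) AS users
    FROM series
    LEFT JOIN cte ON cte.msg_ct = series.n
    ORDER BY series.n
""").fetchall()

print(rows)  # [(0, 0), (1, 2), (2, 0), (3, 1)]
```

The cte is read twice here as well: once to find the upper bound of the series and once in the LEFT JOIN.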
Counting starts at 0 (after my update). If you would rather start at the smallest actual msg_ct, see the first draft of my answer in the edit history.
There is a closely related answer that explains the basics.
As @ClaytonC commented, the answer above does not include users without comments.
To fix this (if you actually need it), start from users and LEFT JOIN to comments:
WITH cte AS (
SELECT msg_ct, count(*) AS users
FROM (
SELECT count(c.user_id) AS msg_ct
FROM users u
LEFT JOIN comments c ON c.user_id = u.id
GROUP BY u.id
) sub
GROUP BY 1
)
SELECT ...
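One detail in the inner query is easy to miss: it counts c.user_id instead of using count(*). After a LEFT JOIN, a user without comments still yields one row (with NULLs on the comments side), so count(*) would report 1 where count(c.user_id) reports 0. A small sketch, assuming SQLite through Python's sqlite3 and a hypothetical user D who has no comments:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        comment TEXT
    );
    -- A and B: 1 comment each, C: 2 comments, D (hypothetical): none.
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO comments (user_id, comment) VALUES
        (1, 'a1'), (2, 'b1'), (3, 'c1'), (3, 'c2');
""")

per_user = conn.execute("""
    SELECT u.name,
           count(c.user_id) AS msg_ct,      -- ignores the NULL row for D
           count(*)         AS joined_rows  -- counts it
    FROM users u
    LEFT JOIN comments c ON c.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

print(per_user)
# [('A', 1, 1), ('B', 1, 1), ('C', 2, 2), ('D', 0, 1)]
```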
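Or, since the join is only needed to find users without comments, it may be cheaper to count all users and subtract the users with comments (which we have processed anyway):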
WITH cte AS (
SELECT msg_ct, count(*)::int AS users
FROM (
SELECT count(*)::int AS msg_ct
FROM comments
GROUP BY user_id
) sub
GROUP BY 1
)
, agg AS (
SELECT max(msg_ct) AS max_ct -- maximum for generate_series
,((SELECT count(*) FROM users) - sum(users))::int AS users
-- the remaining users, i.e. those with 0 comments
FROM cte
)
SELECT 0 AS msg_ct, users FROM agg -- users with 0 comments
UNION ALL
SELECT msg_ct, COALESCE(users, 0)
FROM (SELECT generate_series(1, max_ct) AS msg_ct FROM agg) g
LEFT JOIN cte USING (msg_ct)
ORDER BY 1;
The query gets somewhat more complex, but it may be faster for big tables. Not sure. Test with EXPLAIN ANALYZE (I would appreciate a comment with your results).
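Only EXPLAIN ANALYZE on real data can settle the timing question, but whether the subtraction idea is correct can be checked on a toy data set. The sketch below assumes SQLite through Python's sqlite3 and a hypothetical user D without comments. It compares the LEFT JOIN variant with the subtraction variant, doing the subtraction in Python for brevity rather than in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        comment TEXT
    );
    -- A and B: 1 comment each, C: 2 comments, D (hypothetical): none.
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO comments (user_id, comment) VALUES
        (1, 'a1'), (2, 'b1'), (3, 'c1'), (3, 'c2');
""")

# Variant 1: start from users and LEFT JOIN to comments.
via_join = dict(conn.execute("""
    SELECT msg_ct, count(*) AS users
    FROM (SELECT count(c.user_id) AS msg_ct
          FROM users u
          LEFT JOIN comments c ON c.user_id = u.id
          GROUP BY u.id) AS sub
    GROUP BY msg_ct
""").fetchall())

# Variant 2: aggregate comments only, then derive the 0 bucket as
# (all users) - (users that appear in comments).
via_subtraction = dict(conn.execute("""
    SELECT msg_ct, count(*) AS users
    FROM (SELECT count(*) AS msg_ct
          FROM comments
          GROUP BY user_id) AS sub
    GROUP BY msg_ct
""").fetchall())
total_users = conn.execute("SELECT count(*) FROM users").fetchone()[0]
via_subtraction[0] = total_users - sum(via_subtraction.values())

print(via_join == via_subtraction)  # True
```

As in the answer, the subtraction relies on every comment belonging to an existing user, which is what the foreign key is assumed to guarantee.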