SQL GROUP BY with distinct rows

Time: 2015-02-05 13:05:50

Tags: sql postgresql

I have a Postgres database containing a very long table with 3 columns, like this:

s_id | c_id | a_id
 1   |  1   |  2
 1   |  1   |  3
 1   |  3   |  15
 2   |  1   |  2
 2   |  2   |  23
 3   |  1   |  2
 3   |  3   |  16

I have a query that finds all s_ids that contain both c_id 1 and c_id 3, and returns them along with their counts:

SELECT s_id, COUNT(s_id) as matching_clusters 
FROM test 
WHERE c_id IN (1,3) 
GROUP BY s_id HAVING COUNT(c_id) >= 2 
ORDER BY matching_clusters DESC

What I get is the following:

s_id | matching_clusters
 1   |         3
 3   |         2 

However, I want each duplicated c_id to be counted only once, so the result should be

s_id | matching_clusters
 1   |         2
 3   |         2 

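Any suggestions on how to do this? I thought I could stick a DISTINCT into the COUNT call, but that did not work. I could join the query result back to the table itself on distinct c_ids, but I would rather not run the query again, because queries against this table are computationally very expensive.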

2 answers:

Answer 0: (score: 1)

If I understand correctly, then this will work:

SELECT s_id, 2 as matching_clusters 
FROM test 
WHERE c_id IN (1,3) 
GROUP BY s_id
HAVING COUNT(DISTINCT c_id) = 2 
ORDER BY matching_clusters DESC;

This may be what you want:

SELECT s_id, COUNT(DISTINCT c_id) as matching_clusters 
FROM test 
WHERE c_id IN (1,3) 
GROUP BY s_id
HAVING COUNT(DISTINCT c_id) = 2 
ORDER BY matching_clusters DESC;

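Note the use of DISTINCT in the HAVING clause: without it, an s_id with two rows for c_id 1 and none for c_id 3 would still pass the filter.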
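
To try this against the sample data from the question, here is a minimal sketch. The table name test and the rows come from the question; the integer column types and the use of a temporary table are assumptions made only for illustration, and the extra s_id sort key is added just to make the row order deterministic.

```sql
-- Sample rows from the question; integer column types are an assumption.
CREATE TEMP TABLE test (s_id int, c_id int, a_id int);

INSERT INTO test (s_id, c_id, a_id) VALUES
    (1, 1, 2),
    (1, 1, 3),
    (1, 3, 15),
    (2, 1, 2),
    (2, 2, 23),
    (3, 1, 2),
    (3, 3, 16);

-- COUNT(DISTINCT c_id) counts each c_id once per s_id, so the two
-- (s_id = 1, c_id = 1) rows contribute 1 to the count instead of 2.
SELECT s_id, COUNT(DISTINCT c_id) AS matching_clusters
FROM test
WHERE c_id IN (1, 3)
GROUP BY s_id
HAVING COUNT(DISTINCT c_id) = 2
ORDER BY matching_clusters DESC, s_id;
```

On this data the query returns s_id 1 and s_id 3, each with matching_clusters = 2. s_id 2 is dropped because only one of its c_ids (1) is in the list. The whole thing is a single pass over the filtered rows, so no self-join or second run of the query is needed.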

Answer 1: (score: -1)

Try this:

SELECT s_id, COUNT(DISTINCT s_id) as matching_clusters 
FROM test 
WHERE c_id IN (1,3) 
GROUP BY s_id HAVING COUNT(c_id) >= 2 
ORDER BY matching_clusters DESC