SQL Count(*) and Group By - Finding differences between rows

Time: 2009-07-22 20:27:32

Tags: sql

Below is the SQL query I wrote to find the total number of rows for each product ID (proc_id):

SELECT proc_id, count(*)
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
GROUP BY proc_id
ORDER BY proc_id;

Here is the result of the query above:

proc_id count(*)
01  626
02  624
03  626
04  624
05  622
06  624
07  624
09  624

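Note that the totals for proc_id = '01', proc_id = '03', and proc_id = '05' are different: they do not have 624 rows like the other proc_ids.

How can I write a SQL query that finds which rows of proc_id = '01', proc_id = '03', and proc_id = '05' differ from the rows of the other proc_ids?

6 Answers:

Answer 0: (score: 14)

First, you need to define the criterion that makes '624' correct. Is it the average count(*)? Is it the most frequently occurring count(*)? Is it your favorite count(*)?

Then you can use a HAVING clause to pick out the groups that do not match your criterion: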

SELECT proc_id, count(*)
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
GROUP BY proc_id
HAVING count(*) <> 624
ORDER BY proc_id;

Or:

SELECT proc_id, count(*)
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
GROUP BY proc_id
HAVING count(*) <> (
  <insert here a subquery that produces the magic '624'>
 )
ORDER BY proc_id;
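
One possible way to fill in that subquery, assuming the "correct" count is the one that occurs most often across the groups. This is only a sketch: it runs the SQL through Python's sqlite3 module against a small made-up table (the item column and the row counts are invented for illustration), and the function name odd_proc_ids is hypothetical. LIMIT is not available in every database, so the subquery may need adapting.

```python
import sqlite3

def odd_proc_ids(conn):
    """Return (proc_id, row_count) for groups whose count differs from the most common count."""
    sql = """
        WITH counts AS (
            SELECT proc_id, COUNT(*) AS n
            FROM proc
            WHERE grouping_primary = 'SLB'
              AND eff_date = '01-JUL-09'
            GROUP BY proc_id
        )
        SELECT proc_id, n
        FROM counts
        WHERE n <> (
            -- the "magic" number: the count shared by the most proc_ids
            -- (ties are broken by taking the smaller count)
            SELECT n FROM counts
            GROUP BY n
            ORDER BY COUNT(*) DESC, n
            LIMIT 1
        )
        ORDER BY proc_id
    """
    return conn.execute(sql).fetchall()

# Toy data (assumption): small counts stand in for 626 / 624 / 622.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proc (proc_id TEXT, grouping_primary TEXT, eff_date TEXT, item TEXT)"
)
sizes = {"01": 5, "02": 4, "03": 5, "04": 4, "05": 3, "06": 4}
for pid, n in sizes.items():
    conn.executemany(
        "INSERT INTO proc VALUES (?, 'SLB', '01-JUL-09', ?)",
        [(pid, f"item{i}") for i in range(n)],
    )

print(odd_proc_ids(conn))  # [('01', 5), ('03', 5), ('05', 3)]
```

Here the most common count is 4, so only the proc_ids with 5 or 3 rows are reported.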

Answer 1: (score: 2)

If you know that 624 is the magic number:

SELECT proc_id, count(*)
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
GROUP BY proc_id
HAVING count(*) <> 624
ORDER BY proc_id;

Answer 2: (score: 0)

Try this:

SELECT proc_id, count(*)
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
GROUP BY proc_id
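HAVING count(*) <> (
  -- same filters as the outer query; '02' is assumed to be a "correct" proc_id
  SELECT count(*)
  FROM proc z
  WHERE z.grouping_primary = 'SLB'
  AND   z.eff_date = '01-JUL-09'
  AND   z.proc_id = '02'
)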
ORDER BY proc_id;

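Answer 3: (score: 0)

You can't do this. For some procIds, there are fewer rows for that procId. In other words, the rows that cause a procId not to have count = 624 are rows that do not exist. How could any query show you those rows?

For the ProcIds that have too many rows: IF (and this is a big if) all the rows of the procIds that have 624 rows share some attribute with a 624-row subset of the oversized sets, then you might be able to identify the "extra" rows. But there is no way to identify the missing rows; all you can do is determine which procIds have too many rows or too few...

Answer 4: (score: 0)

If I understand your question (unlike the other answers posted), you want to know which rows make proc_id 01 different? If that is the case, you need to join on all the columns that should be the same and look for the differences. So, to compare 01 with 02: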

 SELECT [01].*, [02].* -- include [02] so rows found only in [02] are not returned as all NULLs
 FROM (
    SELECT * FROM proc
    WHERE grouping_primary = 'SLB'
    AND eff_date = '01-JUL-09'
    AND proc_id = '01'
 ) as [01]
 FULL JOIN (
    SELECT * FROM proc
    WHERE grouping_primary = 'SLB'
    AND eff_date = '01-JUL-09'
    AND proc_id = '02'
 ) as [02] ON
    [01].col1 = [02].col1
    AND [01].col2 = [02].col2
    AND [01].col3 = [02].col3
    /* etc...just don't include proc_id */
 WHERE
    [01].proc_id IS NULL --no match in [02]
    OR [02].proc_id IS NULL --no match in [01]

I'm pretty sure MS SQL Server has a row hash function that might make this easier if you have a bunch of columns... but I can't think of its name.
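
As an alternative to joining on every column, a set-difference operator can do the same comparison (EXCEPT in SQLite, PostgreSQL, and SQL Server; Oracle traditionally spells it MINUS). The following is a minimal sketch, not a drop-in solution: it uses Python's sqlite3 module, the columns col1 and col2 stand in for "all the columns that should be the same", and the helper name rows_only_in and the data are made up.

```python
import sqlite3

def rows_only_in(conn, left_id, right_id):
    """Rows (ignoring proc_id) that exist for left_id but not for right_id."""
    sql = """
        SELECT col1, col2 FROM proc
        WHERE grouping_primary = 'SLB' AND eff_date = '01-JUL-09' AND proc_id = ?
        EXCEPT
        SELECT col1, col2 FROM proc
        WHERE grouping_primary = 'SLB' AND eff_date = '01-JUL-09' AND proc_id = ?
        ORDER BY col1, col2
    """
    return conn.execute(sql, (left_id, right_id)).fetchall()

# Toy data (assumption): 01 and 02 share two rows and each has one row of its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proc (proc_id TEXT, grouping_primary TEXT, eff_date TEXT,"
    " col1 TEXT, col2 INTEGER)"
)
rows = {
    "01": [("a", 1), ("b", 2), ("c", 3)],
    "02": [("a", 1), ("b", 2), ("d", 4)],
}
for pid, data in rows.items():
    conn.executemany(
        "INSERT INTO proc VALUES (?, 'SLB', '01-JUL-09', ?, ?)",
        [(pid, c1, c2) for c1, c2 in data],
    )

print(rows_only_in(conn, "01", "02"))  # [('c', 3)]
print(rows_only_in(conn, "02", "01"))  # [('d', 4)]
```

Running it in both directions gives the two halves of what the FULL JOIN returns. Note that EXCEPT removes duplicates, so it will not reveal a row that simply appears twice for one proc_id.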

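Answer 5: (score: 0)

Well, to find the extra rows, you would use a NOT IN phrase. To find the missing rows, you would need to reverse the logic. This naturally assumes that all 624 rows are the same from proc_id to proc_id.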

SELECT proc_id, varying_column
FROM proc
WHERE grouping_primary = 'SLB'
AND   eff_date = '01-JUL-09'
AND   varying_column NOT IN (SELECT b.varying_column
                             FROM proc b
                             WHERE b.grouping_primary = 'SLB'
                             AND   b.eff_date = '01-JUL-09'
                             -- reference set: the first proc_id that has exactly 624 rows
                             AND   b.proc_id = (SELECT MIN(c.proc_id)
                                                FROM (SELECT a.proc_id
                                                      FROM proc a
                                                      WHERE a.grouping_primary = 'SLB'
                                                      AND   a.eff_date = '01-JUL-09'
                                                      GROUP BY a.proc_id
                                                      HAVING COUNT(*) = 624) c))
ORDER BY proc_id, varying_column;
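
To make "reverse the logic" concrete, here is a small sketch of both directions, assuming a known-good reference proc_id is passed in rather than looked up by its count. It uses Python's sqlite3 module; the helper names extra_rows and missing_rows and the toy data are invented for illustration.

```python
import sqlite3

FILTER = "grouping_primary = 'SLB' AND eff_date = '01-JUL-09'"

def extra_rows(conn, proc_id, reference_id):
    """varying_column values that proc_id has but reference_id does not."""
    sql = f"""
        SELECT varying_column FROM proc
        WHERE {FILTER} AND proc_id = ?
          AND varying_column NOT IN (
              SELECT varying_column FROM proc
              WHERE {FILTER} AND proc_id = ?)
        ORDER BY varying_column
    """
    return [row[0] for row in conn.execute(sql, (proc_id, reference_id))]

def missing_rows(conn, proc_id, reference_id):
    """Reversed logic: values the reference has but proc_id lacks."""
    return extra_rows(conn, reference_id, proc_id)

# Toy data (assumption): 02 is the reference, 01 has two extras, 05 lacks one row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proc (proc_id TEXT, grouping_primary TEXT, eff_date TEXT,"
    " varying_column TEXT)"
)
data = {"01": "abcdef", "02": "abcd", "05": "abd"}
for pid, values in data.items():
    conn.executemany(
        "INSERT INTO proc VALUES (?, 'SLB', '01-JUL-09', ?)",
        [(pid, v) for v in values],
    )

print(extra_rows(conn, "01", "02"))    # ['e', 'f']
print(missing_rows(conn, "05", "02"))  # ['c']
```

One caveat with NOT IN: if the subquery returns a NULL, the predicate never evaluates to true and no rows come back, so varying_column should be non-NULL (or the query rewritten with NOT EXISTS).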