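Fetching duplicate records under different scenarios

Date: 2019-05-12 13:15:14

Tags: sql-server

I am trying to write queries that find duplicate values in column A whose column B values differ. Basically, I need a query for each of three scenarios.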

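Scenario 1:

My requirement is: "I need the COL A value as output only if all values in COL B within the group are distinct and one of those distinct values is EXTERNAL."

I already have a query for this scenario, taken from an older post: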

SELECT test.colA
FROM test INNER JOIN (
    SELECT colA, COUNT(DISTINCT colB) AS cntUnique, COUNT(colB) AS cntAll 
    FROM test
    GROUP BY colA
)t ON test.colA = t.colA
GROUP BY test.colA
HAVING SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) = 1 
    AND MAX(t.cntUnique) = MAX(t.cntAll)

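Scenario 2:

My requirement is: "I need the COL A value as output only if one of the values in COL B is EXTERNAL, the values in the group are not all the same, and COL B contains some value other than EXTERNAL."

Scenario 3:

I need to select all records that do not fall under scenario 1 or scenario 2.

My sample records and the required results are given below.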
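
For anyone who wants to reproduce this, here is a minimal setup sketch. The rows are copied from the test data used in the second answer below and are assumed to match the sample records; the column types and lengths are guesses, not part of the question.

```sql
-- Hypothetical table definition: the column types are assumptions.
CREATE TABLE test (
    colA INT         NOT NULL,
    colB VARCHAR(20) NOT NULL
);

-- Rows taken from the VALUES list in the second answer.
INSERT INTO test (colA, colB) VALUES
    (123, 'EXTERNAL'), (123, 'INTERNAL'),
    (456, 'INTERNAL'), (456, 'IBM'), (456, 'DELL'),
    (345, 'EXTERNAL'), (345, 'EXTERNAL'), (345, 'EXTERNAL'),
    (434, 'INTERNAL'), (434, 'US'), (434, 'US'), (434, 'EXTERNAL'),
    (567, 'INTERNAL'), (567, 'EXTERNAL'), (567, 'EXTERNAL'), (567, 'IBM'),
    (121, 'INTERNAL'), (121, 'INTERNAL'), (121, 'INTERNAL'),
    (131, 'EXTERNAL'), (131, 'IBM');
```

According to the results posted in the answers, 123 and 131 belong to scenario 1, 434 and 567 to scenario 2, and 121, 345 and 456 to scenario 3.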


2 Answers:

Answer 0: (score: 0)

For your scenario 1, I think this will also work:

SELECT colA
FROM test 
GROUP BY colA
HAVING 
  SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) = 1 
  AND 
  COUNT(DISTINCT ColB) = COUNT(*)

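For scenario 2, the number of rows containing 'EXTERNAL' for each value of colA should be greater than 0 and less than the total number of rows minus 1: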

SELECT colA
FROM test 
GROUP BY colA
HAVING 
  SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) BETWEEN 1 AND COUNT(*) - 2 
  AND
  SUM(CASE WHEN colB <> 'EXTERNAL' THEN 1 ELSE 0 END) > 
  COUNT(DISTINCT ColB) - SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END)
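
One edge case is worth noting: the upper bound `COUNT(*) - 2` skips a group such as (EXTERNAL, EXTERNAL, IBM), and the scenario 3 query does not return that group either. If scenario 2 is meant to cover every group that mixes 'EXTERNAL' with other values and does not qualify for scenario 1 (an assumption about the requirement, though it is the reading the second answer takes), a possible sketch is:

```sql
-- Sketch only: treats scenario 2 as "has EXTERNAL, has something else,
-- and is not scenario 1". Assumes colB contains no NULLs.
SELECT colA
FROM test
GROUP BY colA
HAVING
  -- at least one EXTERNAL row, and at least one non-EXTERNAL row
  SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) BETWEEN 1 AND COUNT(*) - 1
  AND
  -- exclude scenario 1: exactly one EXTERNAL and all values distinct
  NOT (
    SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) = 1
    AND COUNT(DISTINCT colB) = COUNT(*)
  )
```

On the sample rows used in this thread this form selects the same groups, 434 and 567.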

For scenario 3, the number of rows containing 'EXTERNAL' for each value of colA should be either 0 or equal to the total number of rows:

SELECT colA
FROM test 
GROUP BY colA
HAVING 
  SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) IN (0, COUNT(*))
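
The question asks for all records rather than just the colA values. Assuming that means every row of the matching groups, one way to sketch it is to use the grouped query as a subquery (the alias `t` and the `ORDER BY` are illustrative additions):

```sql
-- Return every row, not just colA, for the groups that fall under scenario 3.
SELECT t.colA, t.colB
FROM test AS t
WHERE t.colA IN (
    SELECT colA
    FROM test
    GROUP BY colA
    HAVING SUM(CASE WHEN colB = 'EXTERNAL' THEN 1 ELSE 0 END) IN (0, COUNT(*))
)
ORDER BY t.colA, t.colB;
```

The same wrapper should work for the scenario 1 and scenario 2 queries by swapping in their `HAVING` clauses.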

See the demo.
Results:

Scenario 1

> | colA |
> | ---: |
> |  123 |
> |  131 |


Scenario 2

> | colA |
> | ---: |
> |  434 |
> |  567 |


Scenario 3

> | colA |
> | ---: |
> |  121 |
> |  345 |
> |  456 |

Answer 1: (score: 0)

An "all-in-one" solution. Filter the result set by the scenario_id column to get the data you need.

with
  test as (
    select
      *
     from (
       values (123, 'EXTERNAL'), (123, 'INTERNAL'),
              (456, 'INTERNAL'), (456, 'IBM'), (456, 'DELL'),
              (345, 'EXTERNAL'), (345, 'EXTERNAL'), (345, 'EXTERNAL'),
              (434, 'INTERNAL'), (434, 'US'), (434, 'US'), (434, 'EXTERNAL'),
              (567, 'INTERNAL'), (567, 'EXTERNAL'), (567, 'EXTERNAL'), (567, 'IBM'),
              (121, 'INTERNAL'), (121, 'INTERNAL'), (121, 'INTERNAL'),
              (131, 'EXTERNAL'), (131, 'IBM')
     ) t(cola, colb)
  ),
  t as (
    select
      cola,
      count(*) qnt,
      count(distinct colb) distinct_qnt,
      sum(iif(colb = 'EXTERNAL', 1, 0)) external_qnt
    from test
    group by cola
  )
select
  cola,
  iif(external_qnt = 0 or external_qnt = qnt,
        3, iif(external_qnt = 1 and distinct_qnt = qnt, 1, 2)) scenario_id
from t;

Output:

+------+-------------+
| cola | scenario_id |
+------+-------------+
|  121 |           3 |
|  123 |           1 |
|  131 |           1 |
|  345 |           3 |
|  434 |           2 |
|  456 |           3 |
|  567 |           2 |
+------+-------------+

Tested online with Rextester.
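
To show the filtering step, here is a sketch that wraps the classification in a second CTE and keeps a single scenario. It assumes `test` exists as a real table instead of the inline `values` list, and the CTE name `s` is made up for illustration.

```sql
with
  t as (
    -- per-group counts, as in the query above
    select
      cola,
      count(*) qnt,
      count(distinct colb) distinct_qnt,
      sum(iif(colb = 'EXTERNAL', 1, 0)) external_qnt
    from test
    group by cola
  ),
  s as (
    -- classify each group into scenario 1, 2 or 3
    select
      cola,
      iif(external_qnt = 0 or external_qnt = qnt,
            3, iif(external_qnt = 1 and distinct_qnt = qnt, 1, 2)) scenario_id
    from t
  )
select cola
from s
where scenario_id = 2;  -- use 1 or 3 for the other scenarios
```

Going by the output table above, filtering on scenario_id = 2 should leave 434 and 567.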