Remove and identify duplicate records in SQL

Time: 2014-12-15 18:43:44

Tags: sql sql-server-2012

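I have some sample records below, and I need to use a CASE WHEN statement to identify the duplicate records in SQL and remove the non-duplicates from the result.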

Quantity    Values      Desc    event   ID
1               5       Blue    12550   577
1               5       bluee   12550   525
2               10      blu     12550   535

I want to use a CASE statement to show duplicate indicators, for example:

Dup_Quantity    Dup_Value   Dup_Desc  Quantity  Values   Desc   event   ID
Y       Y       N     1     5    Blue   12550   577
Y       Y       N     1     5    Bluee  12550   525 

However, with the script below, the result still shows:

Dup_Quantity    Dup_Value   Dup_Desc  Quantity  Values   Desc   event   ID
Y       Y       N     1     5    Blue   12550   577
Y       Y       N     1     5    Bluee  12550   525 
Y       N       N     2     10   Blu    12550   535


SELECT DISTINCT 
    CASE WHEN a.Quantity = b.Quantity THEN 'Y' ELSE 'N' END AS "Dup_Quantity",
    CASE WHEN a.Values = b.Values THEN 'Y' ELSE 'N' END AS "Dup_Value",
    CASE WHEN a.Desc = b.Desc THEN 'Y' ELSE 'N' END AS "Dup_Desc"
FROM Table1 a 
INNER JOIN Table1 b ON a.event = b.event 
WHERE (a.Quantity = b.Quantity OR a.Values = b.Values OR a.Desc = b.Desc)
    AND a.ID <> b.ID

Basically, the record with ID 535 still shows up in the results. Can someone please give me some pointers?

2 answers:

Answer 0: (score: 2)

SQL Fiddle

MS SQL Server 2012 Schema Setup

CREATE TABLE Table1
    ([Quantity] int, [Values] int, [Desc] varchar(5), [event] int, [ID] int)
;

INSERT INTO Table1
    ([Quantity], [Values], [Desc], [event], [ID])
VALUES
    (1, 5, 'Blue', 12550, 577),
    (1, 5, 'bluee', 12550, 525),
    (2, 10, 'blu', 12550, 535)
;

Query 1

SELECT
CASE WHEN (SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1.Quantity = t2.Quantity AND 
                  t1.ID <> t2.ID AND t1.event = t2.event) > 0
THEN 'Y' ELSE 'N' END AS Dup_Quantity,
CASE WHEN (SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1."Values" = t2."Values" AND 
                  t1.ID <> t2.ID AND t1.event = t2.event) > 0
THEN 'Y' ELSE 'N' END AS Dup_Value,
CASE WHEN (SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1."Desc" = t2."Desc" AND 
                  t1.ID <> t2.ID AND t1.event = t2.event) > 0
THEN 'Y' ELSE 'N' END AS Dup_Desc,
*
FROM Table1 t1
WHERE
(SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1.Quantity = t2.Quantity AND 
                 t1.ID <> t2.ID AND t1.event = t2.event) > 0
OR
(SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1."Values" = t2."Values" AND 
                 t1.ID <> t2.ID AND t1.event = t2.event) > 0
OR
(SELECT COUNT(*) 
          FROM Table1 t2 
           WHERE t1."Desc" = t2."Desc" AND 
                 t1.ID <> t2.ID AND t1.event = t2.event) > 0

Results

| DUP_QUANTITY | DUP_VALUE | DUP_DESC | QUANTITY | VALUES |  DESC | EVENT |  ID |
|--------------|-----------|----------|----------|--------|-------|-------|-----|
|            Y |         Y |        N |        1 |      5 |  Blue | 12550 | 577 |
|            Y |         Y |        N |        1 |      5 | bluee | 12550 | 525 |
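
The correlated subqueries above are repeated in both the SELECT list and the WHERE clause. As a rough alternative sketch (not part of the original answer), the same flags can probably be computed once with window counts, which SQL Server 2012 supports. It assumes the same Table1 as in the schema setup, that ID is unique, and that the compared columns contain no NULLs (PARTITION BY groups NULLs together, whereas `=` never matches them). The names counted, qty_cnt, val_cnt and desc_cnt are made up for illustration.

```sql
-- Sketch only: assumes Table1 from the schema setup, unique IDs, no NULLs.
-- COUNT(*) OVER (...) counts the rows that share the event and the column
-- value, including the current row, so a count above 1 means at least one
-- other row in the same event has the same value.
WITH counted AS (
    SELECT
        t.*,
        COUNT(*) OVER (PARTITION BY t.[event], t.[Quantity]) AS qty_cnt,
        COUNT(*) OVER (PARTITION BY t.[event], t.[Values])   AS val_cnt,
        COUNT(*) OVER (PARTITION BY t.[event], t.[Desc])     AS desc_cnt
    FROM Table1 t
)
SELECT
    CASE WHEN qty_cnt  > 1 THEN 'Y' ELSE 'N' END AS Dup_Quantity,
    CASE WHEN val_cnt  > 1 THEN 'Y' ELSE 'N' END AS Dup_Value,
    CASE WHEN desc_cnt > 1 THEN 'Y' ELSE 'N' END AS Dup_Desc,
    [Quantity], [Values], [Desc], [event], [ID]
FROM counted
-- Keep only rows flagged as a duplicate on at least one column.
WHERE qty_cnt > 1 OR val_cnt > 1 OR desc_cnt > 1;
```

On the three sample rows this should keep IDs 577 and 525 and drop 535, matching the results table above, though the row order is not guaranteed without an ORDER BY.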

Answer 1: (score: 0)

Your query returns:

Dup_Quantity    Dup_Value   Dup_Desc
     Y              Y          N      

I'm not sure what you are trying to do, but the corrected version is:

SELECT 
    CASE WHEN a."Quantity" = b."Quantity" THEN 'Y' ELSE 'N' END AS "Dup_Quantity",
    CASE WHEN a."Values" = b."Values" THEN 'Y' ELSE 'N' END AS "Dup_Value",
    CASE WHEN a."Desc" = b."Desc" THEN 'Y' ELSE 'N' END AS "Dup_Desc",
    a.*
FROM Table1 a 
INNER JOIN Table1 b  ON b.event = a.event 
WHERE (a."Quantity" = b."Quantity" OR a."Values" = b."Values" OR a."Desc" = b."Desc")
    AND a.ID <> b.ID

If you want the rows that are duplicates in terms of Quantity, Values and Desc together:

SELECT 
    a.*
FROM Table1 a 
INNER JOIN Table1 b  ON b.event = a.event 
WHERE (a."Quantity" = b."Quantity" AND a."Values" = b."Values" AND a."Desc" = b."Desc")
    AND a.ID <> b.ID
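
One thing to watch with the self-join above: it returns a row once per matching partner, so a row with two or more duplicates appears several times. A possible variation (a sketch under the same Table1 assumptions, not the answerer's own query) uses EXISTS, which returns each row at most once:

```sql
-- Sketch only: assumes Table1 from the schema setup in the first answer.
-- EXISTS stops at the first matching partner, so each row of "a" is
-- returned once no matter how many duplicates it has.
SELECT a.*
FROM Table1 a
WHERE EXISTS (
    SELECT 1
    FROM Table1 b
    WHERE b.[event]    = a.[event]
      AND b.[Quantity] = a.[Quantity]
      AND b.[Values]   = a.[Values]
      AND b.[Desc]     = a.[Desc]
      AND b.[ID]      <> a.[ID]   -- never match a row against itself
);
```

With the sample data this returns no rows, because no two rows agree on all three columns.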