Delete duplicate rows based on conditions in other columns

Time: 2018-07-06 13:13:27

Tags: sql postgresql

I have a table like this:

ID       | CODE     | DATE
1        | 2398     | 2016-4-3   
1        | null     | 2015-8-3   
2        | 1942     | 2015-9-8   
3        | 6752     | 2013-2-1   
3        | 7217     | 2015-1-1   
4        | 9827     | 2011-2-9

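There are duplicates in "ID", and I want to delete the duplicate rows according to the following conditions:

  1. If one of the "CODE" values is null, delete the row with the null.
  2. If both rows contain an actual code, keep the one with the latest date.
  3. If both rows contain null, keep the one with the latest date.

The desired output is: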

ID       | CODE     | DATE
1        | 2398     | 2016-4-3     
2        | 1942     | 2015-9-8      
3        | 7217     | 2015-1-1   
4        | 9827     | 2011-2-9

I know how to delete duplicates based on a single column:

WITH CTE AS
(
   SELECT *,
          RN = ROW_NUMBER() OVER(PARTITION BY COLUMN ORDER BY COLUMN)
   FROM dbo.YourTable
)
DELETE FROM CTE
WHERE RN > 1

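But I don't know how to add the conditions. Can anyone help?

3 Answers:

Answer 0: (score: 0)

The crux of the query below is to use analytic functions to compute the following quantity: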

COUNT(*) OVER (PARTITION BY ID) - COUNT(CODE) OVER (PARTITION BY ID)

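This quantity equals 1 when a pair of duplicates has exactly one NULL code. In most other cases it is either 2 (both codes are NULL) or 0 (both codes are non-NULL, or the ID has a single record with a non-NULL code).

This lets us decide whether to keep the latest record (for a single record or a pair of duplicates), or to keep only the row with the non-NULL code in a pair of duplicates.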

WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DATE DESC) rn,
        COUNT(*) OVER (PARTITION BY ID) AS total_cnt,
        COUNT(CODE) OVER (PARTITION BY ID) id_cnt
    FROM yourTable
)

DELETE
FROM cte
WHERE
    (total_cnt - id_cnt <> 1 AND rn > 1) OR
    (total_cnt - id_cnt = 1 AND total_cnt > 1 AND CODE IS NULL);
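
To see what this quantity looks like on the question's sample data, here is a minimal sketch. It makes several assumptions: Python's built-in sqlite3 module stands in for the real database, the DATE column is renamed to a hypothetical `dt` and stored as ISO text, and GROUP BY is used instead of window functions, which yields the same per-ID numbers.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption, not the asker's setup).
conn_counts = sqlite3.connect(":memory:")
conn_counts.execute("CREATE TABLE yourTable (id INTEGER, code INTEGER, dt TEXT)")
conn_counts.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, 2398, "2016-04-03"),
        (1, None, "2015-08-03"),
        (2, 1942, "2015-09-08"),
        (3, 6752, "2013-02-01"),
        (3, 7217, "2015-01-01"),
        (4, 9827, "2011-02-09"),
    ],
)

# COUNT(*) counts every row while COUNT(code) skips NULLs,
# so the difference is the number of NULL codes per id.
null_counts = conn_counts.execute(
    """
    SELECT id,
           COUNT(*)               AS total_cnt,
           COUNT(code)            AS id_cnt,
           COUNT(*) - COUNT(code) AS null_cnt
    FROM yourTable
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in null_counts:
    print(row)
```

Only ID 1 ends up with a difference of 1, so it is the only ID for which the query drops the NULL row rather than picking by date.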

Answer 1: (score: 0)

You just need to use ORDER BY:

WITH CTE AS (
     SELECT t.*,
            ROW_NUMBER() OVER (PARTITION BY ID
                               ORDER BY (CASE WHEN Code IS NOT NULL THEN 1 ELSE 2 END),  -- valid codes first
                                         DATE DESC
                              ) as seqnum
     FROM dbo.YourTable t
    )
DELETE FROM CTE
WHERE seqnum > 1;

The first row in the order given by the ORDER BY will have a valid code (if one exists) and the latest date.
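
On an engine that does not allow deleting through a CTE, the same ordering can probably drive a delete keyed on a row identifier instead. The following is only a sketch under assumptions: it runs against an in-memory SQLite database (version 3.25 or later, for window functions) through Python's sqlite3 module, uses SQLite's `rowid` as the identifier (in PostgreSQL a primary key or `ctid` would play that role), and uses the hypothetical names `your_table` and `dt`.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption).
conn_order = sqlite3.connect(":memory:")
conn_order.execute("CREATE TABLE your_table (id INTEGER, code INTEGER, dt TEXT)")
conn_order.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, 2398, "2016-04-03"),
        (1, None, "2015-08-03"),
        (2, 1942, "2015-09-08"),
        (3, 6752, "2013-02-01"),
        (3, 7217, "2015-01-01"),
        (4, 9827, "2011-02-09"),
    ],
)

# Rank rows per id: non-NULL codes first, then newest date first.
# Every row ranked after the first is deleted by its rowid.
conn_order.execute(
    """
    DELETE FROM your_table
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY id
                       ORDER BY (CASE WHEN code IS NOT NULL THEN 1 ELSE 2 END),
                                dt DESC
                   ) AS seqnum
            FROM your_table
        )
        WHERE seqnum > 1
    )
    """
)

remaining_order = conn_order.execute(
    "SELECT id, code, dt FROM your_table ORDER BY id"
).fetchall()
print(remaining_order)
```

The surviving rows match the desired output in the question: one row per ID, preferring a real code and then the latest date.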

Answer 2: (score: 0)

[Postgres does not allow deleting from a CTE]

Just start by encoding the three cases, as in the query below. Once that works, you can combine the first two conditions (and maybe even the third).

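-- zdate stands for the question's DATE column
DELETE FROM thistable d
WHERE ( d.code IS NULL           -- a NULL code while the same id has a real code
        AND EXISTS ( SELECT * FROM thistable x
        WHERE x.id = d.id AND x.code IS NOT NULL
        ) )
OR ( d.code IS NULL              -- a NULL code with a newer NULL row for the same id
        AND EXISTS ( SELECT * FROM thistable x
        WHERE x.id = d.id AND x.code IS NULL
        AND x.zdate > d.zdate
        ) )
OR ( d.code IS NOT NULL          -- a real code with a newer real code for the same id
        AND EXISTS ( SELECT * FROM thistable x
        WHERE x.id = d.id AND x.code IS NOT NULL
        AND x.zdate > d.zdate
        ) );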
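
As a quick check of the three EXISTS branches, here is a sketch that runs the same logic on the question's sample data. The assumptions: Python's sqlite3 module stands in for PostgreSQL, the outer row is referenced by table name rather than by the alias `d`, and ID 5 is a hypothetical extra pair of NULL rows added only to exercise the "both null" rule.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption).
conn_exists = sqlite3.connect(":memory:")
conn_exists.execute("CREATE TABLE thistable (id INTEGER, code INTEGER, zdate TEXT)")
conn_exists.executemany(
    "INSERT INTO thistable VALUES (?, ?, ?)",
    [
        (1, 2398, "2016-04-03"),
        (1, None, "2015-08-03"),
        (2, 1942, "2015-09-08"),
        (3, 6752, "2013-02-01"),
        (3, 7217, "2015-01-01"),
        (4, 9827, "2011-02-09"),
        (5, None, "2012-01-01"),  # hypothetical rows, not in the question
        (5, None, "2014-06-01"),  # hypothetical rows, not in the question
    ],
)

# One branch per rule; inside each subquery the alias x hides the table name,
# so thistable.id refers to the row being considered for deletion.
conn_exists.execute(
    """
    DELETE FROM thistable
    WHERE (code IS NULL
           AND EXISTS (SELECT 1 FROM thistable x
                       WHERE x.id = thistable.id AND x.code IS NOT NULL))
       OR (code IS NULL
           AND EXISTS (SELECT 1 FROM thistable x
                       WHERE x.id = thistable.id AND x.code IS NULL
                         AND x.zdate > thistable.zdate))
       OR (code IS NOT NULL
           AND EXISTS (SELECT 1 FROM thistable x
                       WHERE x.id = thistable.id AND x.code IS NOT NULL
                         AND x.zdate > thistable.zdate))
    """
)

remaining_exists = conn_exists.execute(
    "SELECT id, code, zdate FROM thistable ORDER BY id"
).fetchall()
print(remaining_exists)
```

Note that the strict `>` comparison means two rows with the same ID, the same NULL status, and the same date would both be kept, so a tie-breaker column would be needed in that situation.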