Delete partial duplicates in SQL while ignoring nulls

Time: 2019-02-05 17:27:47

Tags: sql duplicates sql-delete

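I have a table with two possible unique identifiers (ID1 and ID2). Each row has one or both of these identifiers. For a given ID, the data in every row is exactly the same except for the timestamp. I want to remove the duplicates for each ID value, but treat nulls as unique, so rows whose ID is NULL are never considered duplicates of each other.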

This question: How to delete duplicate rows in sql server?

referred me to this site: http://www.codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/

from which I came up with the following query:

WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY ID1) AS RN
FROM Filings_Search
)

DELETE FROM CTE WHERE RN<>1

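Unfortunately, this also deleted all of my nulls! How can I modify this query so that it does not delete the rows where the ID is NULL?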

Edit: Here is a sample of my data (if anyone knows how to format a table nicely, please let me know. I used https://senseful.github.io/text-table/).

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+
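
To reproduce the sample, something like the following setup could be used. This is only a sketch for SQL Server: the column types (INT, VARCHAR, TIME) are assumptions, since the question does not state them; only the table name, column names, and values come from the question.

```sql
-- Hypothetical setup script; column types are guesses, not from the question.
CREATE TABLE Filings_Search
(
    ID1         INT          NULL,
    ID2         VARCHAR(10)  NULL,
    Data        VARCHAR(50)  NOT NULL,
    [Timestamp] TIME(0)      NOT NULL
);

-- The five sample rows shown in the table above.
INSERT INTO Filings_Search (ID1, ID2, Data, [Timestamp])
VALUES (NULL, 'abc', 'macd',     '01:40'),
       (NULL, 'abc', 'macd',     '04:23'),
       (NULL, 'def', 'pfchangs', '01:41'),
       (123,  NULL,  'wendys',   '02:42'),
       (123,  NULL,  'wendys',   '03:45');
```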

Running on ID1 would output:

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
+------+------+----------+-----------+

Running on ID2 would output:

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+

Sorry if this is a duplicate; I am a SQL beginner and could not find anything that matches exactly what I am looking for.

2 Answers:

Answer 0: (score: 0)

How about:

 DELETE FROM CTE 
 WHERE RN<>1
   AND ID1 IS NOT NULL
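
Combined with the CTE from the question, the full statement might look like the sketch below. It assumes SQL Server (which allows deleting through a CTE) and the table and column names from the question. Ordering by Timestamp so that the earliest row survives is an assumption on my part; the question orders by ID1, which leaves the choice of surviving row arbitrary. The extra predicate is needed because PARTITION BY puts every row with a NULL ID1 into a single partition, so those rows get RN 1, 2, 3, ... and all but one would otherwise be deleted.

```sql
-- Sketch for SQL Server; names taken from the question.
WITH CTE AS
(
    SELECT *,
           -- Number the rows within each ID1 value, earliest timestamp first
           -- (ordering by [Timestamp] is an assumption, see above).
           ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY [Timestamp]) AS RN
    FROM Filings_Search
)
DELETE FROM CTE
WHERE RN <> 1
  AND ID1 IS NOT NULL;   -- rows with a NULL ID1 share one partition; skip them
```

On the sample data this should remove only the 03:45 wendys row, which matches the expected ID1 output in the question.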

Answer 1: (score: 0)

Use ID2, and partition by the data as well:

   WITH CTE AS (
        -- the alias f must be declared in FROM for f.* to resolve
        SELECT f.*, ROW_NUMBER() OVER (PARTITION BY ID2, Data ORDER BY Timestamp) AS RN
        FROM Filings_Search f
    )
    DELETE FROM CTE WHERE RN<>1
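
Note that partitioning by ID2 and Data still groups rows whose ID2 is NULL: in the sample, the two wendys rows would fall into the same (NULL, wendys) partition and one of them would be deleted, which is not the ID2 output the question asks for. A possible variant, sketched here under the assumption of SQL Server and the question's names, keeps the NULL rows out of the numbering altogether by filtering inside the CTE:

```sql
-- Sketch for SQL Server; names taken from the question.
WITH CTE AS
(
    SELECT f.*,
           -- Number the rows within each non-NULL ID2 value, earliest first.
           ROW_NUMBER() OVER (PARTITION BY f.ID2 ORDER BY f.[Timestamp]) AS RN
    FROM Filings_Search AS f
    WHERE f.ID2 IS NOT NULL      -- rows with a NULL ID2 never enter the CTE
)
DELETE FROM CTE
WHERE RN <> 1;
```

Because the question states that the data is identical for a given ID, partitioning by ID2 alone should be enough here; adding Data to the PARTITION BY list would only matter if that assumption does not hold.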