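In MS SQL Server, I am trying to remove duplicates from a table that contains NULL values. Groan. Lots of NULLs. The main thing is that I need to keep one copy of every duplicated record, with or without NULLs. I basically want NULL to behave like an ordinary value of "NULL" during the operation and then go back to being a true NULL afterwards. Is this possible? Is there a simpler solution?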
Table1 looks like:
UID  Data1  Data2
1    A      NULL
2    A      NULL
3    B      abc
4    B      abc
5    C      NULL
6    D      ghj
I want the command to discard rows 2 and 4 and keep the rest. (The SELECT is for testing.)
;SELECT UID, Data1, Data2
FROM Table1 AS T
WHERE NOT EXISTS (
    SELECT 1
    FROM table1 AS T2
    WHERE
        T2.Data1 = T.Data1
        AND T2.Data2 = T.Data2
        AND T2.UID >= T.UID
)
AND Data1 IS NOT NULL
Note: SELECT DISTINCT does not work because the duplicates have different timestamps.
Answer 0 (score: 3)
This should do it:
;WITH CTE AS
(
    SELECT *,
           RN = ROW_NUMBER() OVER(PARTITION BY Data1, Data2 ORDER BY UID)
    FROM table1
)
DELETE
--SELECT *
FROM CTE
WHERE RN > 1
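To see why this copes with the NULLs, it may help to look at the row numbers themselves. PARTITION BY, like GROUP BY, puts NULLs into the same group, whereas an `=` comparison never evaluates to true when either side is NULL. Below is a minimal sketch, assuming a table variable `@t` (a hypothetical stand-in for Table1) loaded with the sample rows from the question:

```sql
-- @t is a hypothetical stand-in for Table1, holding the question's sample rows
DECLARE @t TABLE (UID int PRIMARY KEY, Data1 char(1), Data2 char(3));

INSERT INTO @t (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),
       (4,'B','abc'),(5,'C',NULL),(6,'D','ghj');

-- PARTITION BY puts NULLs in the same group, so UID 1 and 2 share a partition
SELECT UID, Data1, Data2,
       RN = ROW_NUMBER() OVER(PARTITION BY Data1, Data2 ORDER BY UID)
FROM @t
ORDER BY UID;
-- UID 2 and 4 come back with RN = 2; every other row has RN = 1
```

Only the rows with RN > 1 are deleted, so the remaining rows keep their real NULLs and nothing has to be converted to a placeholder and back.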
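Update after comment
OK, if you are having trouble deleting that many rows, you could try creating a lookup table containing the IDs you want to delete and then deleting in batches (you will have to test the batch size, though). Here is an idea (assuming UID is the PK):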
;WITH CTE AS
(
    SELECT *,
           RN = ROW_NUMBER() OVER(PARTITION BY Data1, Data2 ORDER BY UID)
    FROM table1
)
SELECT [UID]
INTO RowsToDelete
FROM CTE
WHERE RN > 1;

CREATE INDEX I_UID ON RowsToDelete([UID]);

WHILE 1=1
BEGIN
    -- T-SQL needs the target alias named right after DELETE when a join is used
    DELETE TOP (10000) T
    FROM table1 AS T
    INNER JOIN RowsToDelete AS L
        ON T.[UID] = L.[UID];

    -- A short batch means nothing is left to delete
    IF @@ROWCOUNT < 10000 BREAK;
END
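Once the loop has finished, a quick check along these lines may be worthwhile. This is only a sketch, assuming the `table1` and `RowsToDelete` names used above; GROUP BY treats NULLs as one group, so the query should return no rows if every duplicate is gone:

```sql
-- Any (Data1, Data2) combination still present more than once shows up here
SELECT Data1, Data2, COUNT(*) AS Copies
FROM table1
GROUP BY Data1, Data2
HAVING COUNT(*) > 1;

-- The lookup table is no longer needed after a clean check
DROP TABLE RowsToDelete;
```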
Answer 1 (score: 0)
SELECT DISTINCT Data1, Data2 FROM Table1
Wouldn't that be enough?
Answer 2 (score: 0)
Try this:
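;WITH uTable AS (
    -- Ascending ORDER BY UID keeps the lowest UID in each group,
    -- so rows 2 and 4 are the ones dropped, as the question asks
    SELECT UID, Data1, Data2,
           ROW_NUMBER() OVER (PARTITION BY Data1, Data2 ORDER BY UID) AS rownum
    FROM Table1 AS T)
SELECT UID, Data1, Data2
FROM uTable
WHERE rownum = 1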
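If the NOT EXISTS shape from the question is preferred, it can probably be made NULL-safe as well. The sketch below is an assumption rather than something taken from the posts here: it swaps the `=` comparisons for the INTERSECT idiom, which treats two NULLs as matching, and it uses a table variable `@t` as a hypothetical stand-in for Table1. The UID comparison also has to be strict (`<`), since with `>=` every row matches itself and nothing is returned.

```sql
-- @t is a hypothetical stand-in for Table1, holding the question's sample rows
DECLARE @t TABLE (UID int PRIMARY KEY, Data1 char(1), Data2 char(3));

INSERT INTO @t (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),
       (4,'B','abc'),(5,'C',NULL),(6,'D','ghj');

-- Keep a row only if no earlier row (lower UID) has the same Data1 and Data2
SELECT T.UID, T.Data1, T.Data2
FROM @t AS T
WHERE NOT EXISTS (
    SELECT 1
    FROM @t AS T2
    WHERE T2.UID < T.UID
      -- INTERSECT returns a row when both sides match, counting NULL = NULL as a match
      AND EXISTS (SELECT T.Data1, T.Data2
                  INTERSECT
                  SELECT T2.Data1, T2.Data2)
)
ORDER BY T.UID;
-- Returns UID 1, 3, 5 and 6
```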
Answer 3 (score: 0)
My solution:
DECLARE @data TABLE (UID int, Data1 char(1), Data2 char(3))

-- Your example data
INSERT INTO @data (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),(4,'B','abc'),(5,'C',NULL),(6,'D','ghj')

-- Delete every row after the first in each (Data1, Data2) group
DELETE FROM @data WHERE UID IN (
    SELECT UID FROM (
        SELECT UID, ROW_NUMBER() OVER(PARTITION BY Data1, Data2 ORDER BY UID) AS RowNo FROM @data
    ) d WHERE d.RowNo > 1
)

SELECT UID, Data1, Data2 FROM @data