Deleting duplicate rows from a SQL table

Time: 2014-03-11 19:54:19

Tags: sql sql-server

My SQL table has no primary key column. I can find the duplicate rows, but I don't know how to delete all but one of them. Let me explain:

col1     col2    col3    col4
10       0       1000    1    
10       0       1000    1    --> should be deleted
10       0       1111    2    --> should be deleted
10       1       1000    1
10       2       1000    1
15       0       1000    1
15       0       1000    1    --> should be deleted
16       0       1000    1

I use col1 and col2 to decide whether a row is a duplicate. The pair "10", "0" must be unique, but the table may contain many "10" values or many "0" values.

Thanks.

2 answers:

Answer 0 (score: 2)

This should work. It first finds how many duplicates each combination has, then deletes all but one of them.

CREATE TABLE t_test (col1 int, col2 int, col3 int, col4 int)

INSERT t_test 
          SELECT 10, 0, 1000, 1 
UNION ALL SELECT 10, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 10, 0, 1111, 2 --> should be deleted
UNION ALL SELECT 10, 1, 1000, 1
UNION ALL SELECT 10, 2, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 16, 0, 1000, 1

DECLARE @col1 int, @col2 int, @count int

-- One cursor row per (col1, col2) pair that occurs more than once, with its row count.
DECLARE delete_loop CURSOR LOCAL STATIC
    FOR SELECT COUNT(*), col1, col2
          FROM t_test
         GROUP BY col1, col2
        HAVING COUNT(*) > 1
OPEN delete_loop
FETCH NEXT FROM delete_loop INTO @count, @col1, @col2
WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Delete all but one row for this (col1, col2) pair.
        DELETE TOP (@count - 1)
          FROM t_test
         WHERE col1 = @col1
           AND col2 = @col2

        FETCH NEXT FROM delete_loop INTO @count, @col1, @col2
    END
CLOSE delete_loop
DEALLOCATE delete_loop

SELECT * FROM t_test

Edit: This only enforces uniqueness on col1 and col2.
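
As a quick sanity check, the same grouping query the cursor uses can be run again after the loop finishes. This is only a sketch against the t_test table from this answer; once the cleanup has worked, it should return no rows.

```sql
-- Sanity check (assumes the t_test table created above).
-- Any (col1, col2) pair listed here still has more than one row.
SELECT col1, col2, COUNT(*) AS row_count
  FROM t_test
 GROUP BY col1, col2
HAVING COUNT(*) > 1;
```

Note that DELETE TOP without an ORDER BY gives no guarantee about which rows are removed, so for the pair 10, 0 the surviving row may be either the 1000, 1 row or the 1111, 2 row.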

Answer 1 (score: 0)

Here is a simple way to identify the duplicates and delete them.

Add an incrementing id for each combination of col1 and col2 (PARTITION BY), wrap it in a CTE, and delete the records whose id is not 1 (the first occurrence).

DECLARE @Test TABLE (col1 int, col2 int, col3 int, col4 int)

INSERT @Test 
          SELECT 10, 0, 1000, 1 
UNION ALL SELECT 10, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 10, 0, 1111, 2 --> should be deleted
UNION ALL SELECT 10, 1, 1000, 1
UNION ALL SELECT 10, 2, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 16, 0, 1000, 1

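-- Number the rows within each (col1, col2) group; the row numbered 1 is kept.
;WITH DUPES
AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY COL1,COL2 ORDER BY COL1,COL4) AS myID
FROM @Test
)

-- Deleting through the CTE removes the matching rows from @Test.
DELETE D
FROM DUPES D
WHERE myID <> 1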

SELECT * 
FROM @Test
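
Before running the DELETE, it can help to preview which rows it would remove by selecting from the same CTE. The following is a minimal sketch, not part of the original answer: @Demo is a hypothetical table variable holding a subset of the sample rows, and the CTE is the one used above.

```sql
-- Hypothetical demo table with the same shape as @Test above.
DECLARE @Demo TABLE (col1 int, col2 int, col3 int, col4 int);

INSERT @Demo
          SELECT 10, 0, 1000, 1
UNION ALL SELECT 10, 0, 1000, 1
UNION ALL SELECT 10, 0, 1111, 2
UNION ALL SELECT 15, 0, 1000, 1;

-- Same numbering as the DELETE version: rows are numbered within each
-- (col1, col2) group. col1 is constant inside a group, so col4 decides the order.
WITH DUPES AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col1, col4) AS myID
      FROM @Demo
)
-- Preview only: rows numbered 2 and up are the ones the DELETE would remove.
SELECT col1, col2, col3, col4, myID
  FROM DUPES
 WHERE myID > 1;
```

The ORDER BY inside OVER controls which row in each group gets number 1 and therefore survives, so changing it is the way to choose which duplicate to keep.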