Delete records from a SQL Server table without a cursor

Asked: 2009-08-03 14:57:10

Tags: sql-server tsql delete-row

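I am trying to selectively delete records from a SQL Server 2005 table without looping through a cursor. The table can contain many records (sometimes > 500,000), so looping is too slow.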

Data:

ID  UnitID  Day  Interval  Amount
1   100     10   21         9.345
2   100     10   22         9.367
3   200     11   21         4.150
4   300     11   21         4.350
5   300     11   22         4.734
6   300     11   23         5.106
7   400     13   21        10.257
8   400     13   22        10.428

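The key is: ID, UnitID, Day, Interval.

In this example I want to delete records 2, 5 and 8, because each of them is adjacent to an existing record (based on the key).

Note: record 6 would not be deleted, because once 5 is gone it is no longer adjacent to anything.

Am I asking too much?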

5 Answers:

Answer 0 (score: 4)

See these articles in my blog for performance details:


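The main idea of the query below is that we should delete all even rows from each continuous range of intervals.

That is, if for a given (unitId, Day) we have the following intervals: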

1
2
3
4
6
7
8
9

then we have two continuous ranges:

1
2
3
4

6
7
8
9

and we should delete every even row of each range:

1
2 -- delete
3
4 -- delete

6
7 -- delete
8
9 -- delete

so that we are left with:

1
3
6
8

Note that "even rows" here means "rows with an even per-range ROW_NUMBER()", not "rows with an even value of interval".

Here is the query:
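To make the "even row within each range" rule concrete, here is a minimal Python sketch that applies it to the sample rows from the question. It is an illustration of the idea only, not the T-SQL from this answer. The name `ids_to_delete` is made up for the example, and the sketch assumes at most one row per (UnitID, Day, Interval).

```python
from itertools import groupby

# Sample rows from the question: (ID, UnitID, Day, Interval)
ROWS = [
    (1, 100, 10, 21), (2, 100, 10, 22), (3, 200, 11, 21), (4, 300, 11, 21),
    (5, 300, 11, 22), (6, 300, 11, 23), (7, 400, 13, 21), (8, 400, 13, 22),
]

def ids_to_delete(rows):
    """Return the IDs sitting at even positions within each run of consecutive intervals."""
    doomed = []
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[3]))
    for _, group in groupby(ordered, key=lambda r: (r[1], r[2])):
        position = 0     # 1-based position inside the current run
        previous = None  # interval of the previous row in this (UnitID, Day) group
        for row_id, _unit, _day, interval in group:
            if previous is not None and interval == previous + 1:
                position += 1
            else:
                position = 1  # the first row, or a gap, starts a new run
            if position % 2 == 0:
                doomed.append(row_id)
            previous = interval
    return sorted(doomed)

print(ids_to_delete(ROWS))  # [2, 5, 8]
```

Record 6 survives because it is the third row of its run, which matches the requirement stated in the question.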

DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)

INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)

;WITH   rows AS
        (
        SELECT  *,
                ROW_NUMBER() OVER
                (
                PARTITION BY
                        (
                        SELECT  TOP 1 qi.id AS mint
                        FROM    @Table qi
                        WHERE   qi.unitid = qo.unitid
                                AND qi.[day] = qo.[day]
                                AND qi.interval <= qo.interval
                                AND NOT EXISTS
                                (
                                SELECT  NULL
                                FROM    @Table t
                                WHERE   t.unitid = qi.unitid
                                        AND t.[day] = qi.day
                                        AND t.interval = qi.interval - 1
                                )
                        ORDER BY
                                qi.interval DESC
                        )
                ORDER BY interval
                ) AS rnm
        FROM    @Table qo
        )
DELETE
FROM    rows
WHERE   rnm % 2 = 0

SELECT  *
FROM    @Table

Update:

Here is a more efficient query:

DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)

INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)

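;WITH    source AS
        (
        -- rn: position of each row within its (unitid, day) group, ordered by interval
        SELECT  *, ROW_NUMBER() OVER (PARTITION BY unitid, [day] ORDER BY interval) rn
        FROM    @Table
        ),
        rows AS
        (
        -- interval - rn is constant within a run of consecutive intervals,
        -- so rnm is the position of each row inside its own run
        SELECT  *, ROW_NUMBER() OVER (PARTITION BY unitid, [day], interval - rn ORDER BY interval) AS rnm
        FROM    source
        )
-- delete the even-positioned rows of every run
DELETE
FROM    rows
WHERE   rnm % 2 = 0

SELECT  *
FROM    @Table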
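The second query partitions by interval - rn. A small Python sketch, with the hypothetical helper name `island_keys`, shows why that difference works as a per-range key: within a run of consecutive intervals both the interval and the row number grow by one, so their difference stays the same, and every gap increases it.

```python
def island_keys(intervals):
    """Pair each interval with (interval - rn), where rn is its 1-based rank."""
    return [(value, value - rn) for rn, value in enumerate(sorted(intervals), start=1)]

# Intervals from the explanation above: two runs, 1..4 and 6..9
for value, key in island_keys([1, 2, 3, 4, 6, 7, 8, 9]):
    print(value, key)
# The key is 0 for intervals 1..4 and 1 for intervals 6..9
```

This assumes the intervals are unique within a (unitid, day) group. Duplicates would break the constant-difference property.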

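Answer 1 (score: 1)

I don't think what you are asking for is possible, but you may be able to get close. It looks like you can almost do it by finding the records with a self-join like this: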

-- "table" stands in for the real table name
SELECT t1.id
FROM
  table t1 JOIN table t2 ON (
    t1.unitid = t2.unitid AND
    t1.day = t2.day AND
    t1.interval = t2.interval + 1   -- t1 directly follows t2
  )

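The problem is that this will also find id = 6. However, if you create a temporary table from this result, it will probably be much smaller than the original data, and therefore much faster to scan with a cursor (to fix the id = 6 problem). You can then run DELETE FROM table WHERE id IN (SELECT id FROM tmp_table) to delete the rows.

There may be a way to fix the id = 6 problem without a cursor, but if there is, I don't see it.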

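Answer 2 (score: 0)

There is also the WHILE statement, which is an alternative to a cursor. Combining it with table variables may let you do the same thing within acceptable performance limits.

Answer 3 (score: 0)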

DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)

INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.367)
INSERT INTO @Table VALUES (3, 200, 11, 21, 4.150)
INSERT INTO @Table VALUES (4, 300, 11, 21, 4.350)
INSERT INTO @Table VALUES (5, 300, 11, 22, 4.734)
INSERT INTO @Table VALUES (6, 300, 11, 23, 5.106)
INSERT INTO @Table VALUES (7, 400, 13, 21, 10.257)
INSERT INTO @Table VALUES (8, 400, 13, 22, 10.428)

DELETE FROM @Table
WHERE ID IN (
  SELECT t1.ID
  FROM @Table t1
       INNER JOIN @Table t2 
            ON  t2.UnitID = t1.UnitID 
                AND t2.Day = t1.Day 
                AND t2.Interval = t1.Interval - 1
       LEFT OUTER JOIN @Table t3 
            ON  t3.UnitID = t2.UnitID 
                AND t3.Day = t2.Day 
                AND t3.Interval = t2.Interval - 1
  WHERE t3.ID IS NULL)

SELECT * FROM @Table

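Answer 4 (score: 0)

Lieven's answer is so close. It works on the test set, but if I add a few more records, it starts to miss some.

We can't use any odd/even criteria, because we have no idea how the data falls.

Add this data and try again: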

INSERT @Table VALUES (9,    100,     10,   23,        9.345)
INSERT @Table VALUES (10,   100,     10,   24,        9.367)
INSERT @Table VALUES (11,   100,     10,   25,        4.150)
INSERT @Table VALUES (12,   100,     10,   26,        4.350)
INSERT @Table VALUES (13,   300,     11,   25,        4.734)
INSERT @Table VALUES (14,   300,     11,   26,        5.106)
INSERT @Table VALUES (15,   300,     11,   27,       10.257)
INSERT @Table VALUES (16,   300,     11,   29,       10.428)
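One way to check a candidate query against this larger data set is to compute the expected result with a plain sequential pass, which is what the cursor would do: walk the rows in key order and delete a row whenever the row just before it (same UnitID and Day, Interval minus one) is still there. The Python sketch below does that for the question's eight rows plus the eight added here. The function name `sequential_deletes` is hypothetical, and Amount is left out because it does not affect the result.

```python
# (ID, UnitID, Day, Interval): the question's rows plus the rows added in this answer
ROWS = [
    (1, 100, 10, 21), (2, 100, 10, 22), (3, 200, 11, 21), (4, 300, 11, 21),
    (5, 300, 11, 22), (6, 300, 11, 23), (7, 400, 13, 21), (8, 400, 13, 22),
    (9, 100, 10, 23), (10, 100, 10, 24), (11, 100, 10, 25), (12, 100, 10, 26),
    (13, 300, 11, 25), (14, 300, 11, 26), (15, 300, 11, 27), (16, 300, 11, 29),
]

def sequential_deletes(rows):
    """Simulate the cursor: delete a row if its direct predecessor was kept."""
    kept = set()   # (UnitID, Day, Interval) keys of rows that survive
    deleted = []
    for row_id, unit, day, interval in sorted(rows, key=lambda r: (r[1], r[2], r[3])):
        if (unit, day, interval - 1) in kept:
            deleted.append(row_id)
        else:
            kept.add((unit, day, interval))
    return sorted(deleted)

print(sequential_deletes(ROWS))  # [2, 5, 8, 10, 12, 14]
```

Any set-based query that is meant to replace the cursor should delete exactly these IDs on this data.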