In PostgreSQL, I have a query like the following, which will delete 250k rows from a table of about 1 million rows:
DELETE FROM table WHERE key = 'needle';
The query takes over an hour to execute, and during that time the affected rows are locked for writing. That is not good, because it means many update queries have to wait for the big delete to finish (and then they will fail because the rows disappeared out from under them, but that's okay). I need a way to split this big query into smaller parts so it interferes with the update queries as little as possible. For example, if the delete could be split into chunks of 1000 rows each, other update queries would at most have to wait for a delete touching 1000 rows.
DELETE FROM table WHERE key = 'needle' LIMIT 10000;
That query would work nicely, but LIMIT does not exist for DELETE in Postgres.
Answer 0 (score: 22)
Try a subselect and use a unique condition:
DELETE FROM table
WHERE id IN (SELECT id FROM table WHERE key = 'needle' LIMIT 10000);
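To remove all matching rows in chunks, that statement can simply be repeated until it affects no rows. A sketch in PL/pgSQL, keeping the table and column names from the question (the quoted `"table"` name and the batch size are placeholders; note that `COMMIT` inside a `DO` block requires PostgreSQL 11+, so on older versions the loop belongs in client code, one transaction per batch):

```sql
-- Sketch: delete in batches of 10000 so row locks are
-- held only briefly per batch. Requires PostgreSQL 11+
-- for COMMIT inside a DO block.
DO $$
BEGIN
    LOOP
        DELETE FROM "table"
        WHERE id IN (SELECT id FROM "table"
                     WHERE key = 'needle'
                     LIMIT 10000);
        EXIT WHEN NOT FOUND;  -- last batch deleted nothing: done
        COMMIT;               -- release row locks between batches
    END LOOP;
END $$;
```

Committing between batches is what actually lets concurrent updates proceed; a single long transaction would hold all the row locks until the end anyway.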
Answer 1 (score: 1)
Set the lock level for your delete and update statements to a more granular lock mode. Note that your transactions will now be slower.
http://www.postgresql.org/docs/current/static/sql-lock.html
http://www.postgresql.org/docs/current/static/explicit-locking.html
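As an illustration of explicit lock modes (a sketch only; `mytable` stands in for the real table, and ROW EXCLUSIVE is in fact the mode DELETE acquires implicitly, so taking it explicitly documents intent without blocking more than plain DML would):

```sql
BEGIN;
-- Explicitly take the table-level lock mode that DELETE
-- would acquire anyway; concurrent reads and writes to
-- other rows are still allowed under ROW EXCLUSIVE.
LOCK TABLE mytable IN ROW EXCLUSIVE MODE;
DELETE FROM mytable WHERE key = 'needle';
COMMIT;
```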
Answer 2 (score: 0)
Frak's answer is good, but this may be faster; it requires 8.4 because of window function support (pseudocode):
result = query('
    select id from (
        select id, row_number() over (order by id) as row_number
        from mytable where key = ?
    ) as _
    where row_number % 8192 = 0
    order by id', 'needle');
// result contains the id of every 8192nd row whose key = 'needle'
last_id = 0;
result.append(MAX_INT); // guard
for (row in result) {
query('delete from mytable
where id<=? and id>? and key=?', row.id, last_id, 'needle');
// last_id is used to hint query planner,
// that there will be no rows with smaller id
// so it is less likely to use full table scan
last_id = row.id;
}
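Translated into concrete SQL, the boundary query and one chunk delete might look like this (a sketch assuming PostgreSQL 8.4+ and a table `mytable` with primary key `id`; the literal `123456` is a placeholder for a boundary id returned by the first query):

```sql
-- Collect every 8192nd matching id to use as chunk boundaries.
SELECT id FROM (
    SELECT id, row_number() OVER (ORDER BY id) AS rn
    FROM mytable
    WHERE key = 'needle'
) AS numbered
WHERE rn % 8192 = 0
ORDER BY id;

-- Then, for each boundary (last_id starting at 0):
DELETE FROM mytable
WHERE id <= 123456      -- current boundary id from the query above
  AND id >  0           -- previous boundary (last_id)
  AND key = 'needle';
```

The `id > last_id` condition is the hint described in the pseudocode comments: it tells the planner no smaller ids qualify, making an index range scan more likely than a full table scan.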
This is premature optimization, the root of all evil. Beware.