In PostgreSQL, I have a query like the following, which will delete 250k rows from a table of about 1 million rows:
DELETE FROM table WHERE key = 'needle';
The query takes over an hour to execute, and during that time the affected rows are locked for writing. That is not good, because it means many update queries have to wait for the big delete to finish (and then they will fail because the rows disappeared out from under them, but that's okay). I need a way to split this big query into smaller parts so it interferes with the update queries as little as possible. For example, if the delete could be split into chunks of 1000 rows each, other update queries would at most have to wait for a delete touching 1000 rows.
DELETE FROM table WHERE key = 'needle' LIMIT 10000;
That query would work nicely, but LIMIT does not exist for DELETE in Postgres.
Answer 0 (score: 22)
Try a subselect and use a unique condition:
DELETE FROM table
WHERE id IN (SELECT id FROM table WHERE key = 'needle' LIMIT 10000);
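To remove all matching rows in chunks, that statement can simply be repeated until it affects no rows. A sketch in PL/pgSQL, keeping the table and column names from the question (the quoted `"table"` name and the batch size are placeholders; note that `COMMIT` inside a `DO` block requires PostgreSQL 11+, so on older versions the loop belongs in client code, one transaction per batch):

```sql
-- Sketch: delete in batches of 10000 so row locks are
-- held only briefly per batch. Requires PostgreSQL 11+
-- for COMMIT inside a DO block.
DO $$
BEGIN
    LOOP
        DELETE FROM "table"
        WHERE id IN (SELECT id FROM "table"
                     WHERE key = 'needle'
                     LIMIT 10000);
        EXIT WHEN NOT FOUND;  -- last batch deleted nothing: done
        COMMIT;               -- release row locks between batches
    END LOOP;
END $$;
```

Committing between batches is what actually lets concurrent updates proceed; a single long transaction would hold all the row locks until the end anyway.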
Answer 1 (score: 1)
Set the lock level for your delete and update statements to a more granular lock mode. Note that your transactions will now be slower.
http://www.postgresql.org/docs/current/static/sql-lock.html
http://www.postgresql.org/docs/current/static/explicit-locking.html
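As an illustration of explicit lock modes (a sketch only; `mytable` stands in for the real table, and ROW EXCLUSIVE is in fact the mode DELETE acquires implicitly, so taking it explicitly documents intent without blocking more than plain DML would):

```sql
BEGIN;
-- Explicitly take the table-level lock mode that DELETE
-- would acquire anyway; concurrent reads and writes to
-- other rows are still allowed under ROW EXCLUSIVE.
LOCK TABLE mytable IN ROW EXCLUSIVE MODE;
DELETE FROM mytable WHERE key = 'needle';
COMMIT;
```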
Answer 2 (score: 0)
Frak's answer is good, but this may be faster; it requires 8.4 because of window function support (pseudocode):
result = query('
    select id from (
        select id, row_number() over (order by id) as row_number
        from mytable where key = ?
    ) as _
    where row_number % 8192 = 0
    order by id', 'needle');
// result contains the id of every 8192nd row whose key = 'needle'
last_id = 0;
result.append(MAX_INT); // guard
for (row in result) {
query('delete from mytable
where id<=? and id>? and key=?', row.id, last_id, 'needle');
// last_id is used to hint query planner,
// that there will be no rows with smaller id
// so it is less likely to use full table scan
last_id = row.id;
}
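Translated into concrete SQL, the boundary query and one chunk delete might look like this (a sketch assuming PostgreSQL 8.4+ and a table `mytable` with primary key `id`; the literal `123456` is a placeholder for a boundary id returned by the first query):

```sql
-- Collect every 8192nd matching id to use as chunk boundaries.
SELECT id FROM (
    SELECT id, row_number() OVER (ORDER BY id) AS rn
    FROM mytable
    WHERE key = 'needle'
) AS numbered
WHERE rn % 8192 = 0
ORDER BY id;

-- Then, for each boundary (last_id starting at 0):
DELETE FROM mytable
WHERE id <= 123456      -- current boundary id from the query above
  AND id >  0           -- previous boundary (last_id)
  AND key = 'needle';
```

The `id > last_id` condition is the hint described in the pseudocode comments: it tells the planner no smaller ids qualify, making an index range scan more likely than a full table scan.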
This is premature optimization, the root of all evil. Beware.