Deleting rows from a table with a self-referencing column takes a very long time

Time: 2009-05-26 15:17:39

Tags: performance postgresql indexing constraints

Sorry for such a specific question.

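I have a table (see below), and when I try to delete a large number of records from it, my PostgreSQL 8.2.5 spends 98% of the time on the parent-child constraint. I am trying to figure out which index I should add to make this fast. I should say that every row in this table has either 0 or NULL as its parent_block_id: they are all basic (top-level) blocks.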

I tried adding different indexes: a plain one on (parent_block_id); one WHERE parent_block_id = 0; one WHERE parent_block_id IS NULL; and one WHERE parent_block_id != 0. None of them brought any serious performance benefit.

varshavka=> explain analyze delete from infoblocks where template_id = 112;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Seq Scan on infoblocks  (cost=0.00..1234.29 rows=9 width=6) (actual time=13.271..40.888 rows=40000 loops=1)
   Filter: (template_id = 112)
 Trigger for constraint $1: time=4051.219 calls=40000
 Trigger for constraint $2: time=1616.194 calls=40000
 Trigger for constraint cs_ibrs: time=2810.144 calls=40000
 Trigger for constraint cs_ibct: time=4026.305 calls=40000
 Trigger for constraint cs_ibbs: time=3517.640 calls=40000
 Trigger for constraint cs_ibreq: time=774344.010 calls=40000
 Total runtime: 790760.168 ms
(9 rows)



varshavka=> \d infoblocks
                                      Table "public.infoblocks"
     Column      |            Type             |                      Modifiers
-----------------+-----------------------------+------------------------------------------------------
 id              | integer                     | not null default nextval(('IB_SEQ'::text)::regclass)
 parent_block_id | integer                     |
 nm_id           | integer                     | default 0
 template_id     | integer                     | not null
 author_id       | integer                     |
 birthdate       | timestamp without time zone | not null
Indexes:
    "infoblocks_pkey" PRIMARY KEY, btree (id)
    "zeroparent" btree (parent_block_id) WHERE parent_block_id <> 0
Foreign-key constraints:
    "$2" FOREIGN KEY (nm_id) REFERENCES newsmakers(nm_id) ON DELETE RESTRICT
    "$5" FOREIGN KEY (author_id) REFERENCES users(user_id) ON DELETE RESTRICT
    "cs_ibreq" FOREIGN KEY (parent_block_id) REFERENCES infoblocks(id) ON DELETE CASCADE

3 answers:

Answer 0 (score: 2)

Have you tried adding an index on template_id?
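A minimal sketch of that suggestion; the index name infoblocks_template_id_idx is made up for illustration. Note that in the plan posted in the question the sequential scan itself finishes in about 41 ms, so this index would probably speed up only the row lookup, not the constraint triggers that account for most of the runtime.

```sql
-- Hypothetical index name; any unused name will do.
CREATE INDEX infoblocks_template_id_idx ON infoblocks (template_id);

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE infoblocks;
```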

Answer 1 (score: 2)

If you can temporarily block other users, could you drop the constraint cs_ibreq, perform the delete, and then re-add the constraint?
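Something along these lines, as a sketch rather than a tested recipe. The constraint definition is copied from the \d output in the question. With the constraint dropped nothing cascades, so this assumes (as the question states) that no row actually points at the rows being deleted; if some do, re-adding the constraint should fail and the transaction can be rolled back.

```sql
BEGIN;

-- ALTER TABLE locks infoblocks until COMMIT, so other sessions will wait.
ALTER TABLE infoblocks DROP CONSTRAINT cs_ibreq;

-- With the self-referencing constraint gone, its trigger no longer fires per row.
DELETE FROM infoblocks WHERE template_id = 112;

-- Re-adding the constraint re-validates every remaining row in one pass.
ALTER TABLE infoblocks
    ADD CONSTRAINT cs_ibreq FOREIGN KEY (parent_block_id)
    REFERENCES infoblocks (id) ON DELETE CASCADE;

COMMIT;
```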

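Perhaps, because parent_block_id has only one non-null value, the index isn't being used when the constraint is checked? That does seem a bit odd, though.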
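One way to check that guess is to prepare a parameterized statement of roughly the shape the cascade trigger runs and look at its plan. This is only an approximation: the exact query text used internally is an assumption here, and the statement name cascade_probe and the value 12345 are made up. Plain EXPLAIN (without ANALYZE) shows the plan without executing the delete.

```sql
-- Approximation of the per-row query issued for ON DELETE CASCADE (assumed shape).
PREPARE cascade_probe (integer) AS
    DELETE FROM ONLY infoblocks WHERE parent_block_id = $1;

-- Shows whether the plan is an index scan or a sequential scan; deletes nothing.
EXPLAIN EXECUTE cascade_probe (12345);

DEALLOCATE cascade_probe;
```

If this shows a Seq Scan on infoblocks, that would be consistent with each of the 40000 trigger calls scanning the whole table.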

Answer 2 (score: 2)

First of all: the very first thing (the 0th thing!) you should do when you notice ugly query times is make sure you have run VACUUM ANALYZE recently.
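For the table in question that would be something like the following. VACUUM cannot run inside a transaction block, so issue it on its own.

```sql
-- Reclaims dead rows and refreshes the planner statistics for this table.
VACUUM ANALYZE infoblocks;
```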

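If you only need a one-off delete, see araqnid's answer. But if you need something that will keep working in the future, when some rows do have a nonzero, non-null parent_block_id, read on.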

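My guess is that PostgreSQL doesn't combine the deletions caused by ON DELETE CASCADE into a single query. The fact that the EXPLAIN output shows them as triggers suggests that each child-row deletion is in fact carried out separately. Presumably each row can be found with an index lookup on parent_block_id, but that will still be much slower than a single scan through the table.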

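So you could get a large speedup by changing ON DELETE CASCADE to ON DELETE RESTRICT and manually compiling, in a temporary table, a list of all the deletions that need to be performed, then deleting them all at once. This approach will be very fast if the maximum depth of the hierarchy is small. Here is some pseudocode: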

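# Scratch table holding the id and depth of every row to delete.
CREATE TEMP TABLE rows_to_delete (id integer NOT NULL, level integer NOT NULL);

# Insert the top-level rows as "seed" rows.
INSERT INTO rows_to_delete
    SELECT id, 0 FROM infoblocks WHERE template_id = 112;

# Gather all rows that are children of any row at depth curLevel,
# advancing curLevel until no more children are found.
curLevel = 0
while (nRowsReturnedFromLastInsert > 0) {
    INSERT INTO rows_to_delete
        SELECT ib.id, rtd.level + 1
        FROM infoblocks ib
        JOIN rows_to_delete rtd ON (ib.parent_block_id = rtd.id)
        WHERE rtd.level = curLevel
          # Skip rows already collected (e.g. a child that is also a seed row).
          AND ib.id NOT IN (SELECT id FROM rows_to_delete);

    curLevel = curLevel + 1
}

# Delete everything collected in a single statement.
# (PostgreSQL's DELETE has no JOIN clause; USING plays that role.)
DELETE FROM infoblocks
    USING rows_to_delete rtd
    WHERE infoblocks.id = rtd.id;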

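(I'm not sure, but you may actually need to use ON DELETE NO ACTION instead of ON DELETE RESTRICT for the final DELETE to succeed. It's not clear to me whether a single DELETE statement is allowed to delete a parent and all of its descendants when ON DELETE RESTRICT is in effect. If NO ACTION is unacceptable for some reason, you can always loop over multiple DELETE statements, deleting the bottom-most level first, then the next level up, and so on.)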
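A sketch of that level-by-level fallback, assuming rows_to_delete has already been filled with (id, level) pairs and that the deepest level found was 2 (a made-up value for illustration; in practice the client-side loop would count down from the final curLevel).

```sql
-- Deepest descendants first, so no remaining row references a deleted parent.
DELETE FROM infoblocks USING rows_to_delete rtd
    WHERE infoblocks.id = rtd.id AND rtd.level = 2;

DELETE FROM infoblocks USING rows_to_delete rtd
    WHERE infoblocks.id = rtd.id AND rtd.level = 1;

-- Finally the seed rows themselves.
DELETE FROM infoblocks USING rows_to_delete rtd
    WHERE infoblocks.id = rtd.id AND rtd.level = 0;
```

This costs one statement per level rather than one per row, so it should still be cheap as long as the hierarchy is shallow.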