PostgreSQL update performance with select

Date: 2013-06-10 15:00:01

Tags: performance postgresql select sql-update

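"name" is a table with more or less 1 million rows. I have tried the following statement, but it never finishes. Is there a problem with "IN" that I should avoid?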

update name 
   set name_val = true 
where name_pk in (select max (name_pk) 
                  from name 
                  group by foreign_key_pk);

I am not opposed to using a trigger if necessary.

Query plan:

"Nested Loop  (cost=26073.59..26310.38 rows=200 width=54)"
"  ->  HashAggregate  (cost=26073.59..26075.59 rows=200 width=4)"
"        ->  HashAggregate  (cost=23122.82..24598.20 rows=118031 width=12)"
"              ->  Seq Scan on name  (cost=0.00..19956.21 rows=633321 width=12)"
"  ->  Index Scan using name_pk on name  (cost=0.00..1.16 rows=1 width=54)"
"        Index Cond: (public.name.name_pk = (max(public.name.name_pk)))"

Two indexes:

CREATE INDEX link_name_foreign_key_pk
  ON name
  USING btree
  (foreign_key_pk);

CREATE UNIQUE INDEX name_pk
  ON name
  USING btree
  (name_pk);

Thanks.

1 answer:

Answer 0 (score: 2)

Create a multi-column index like this (much like @a_horse already suggested in the comments):

CREATE INDEX name_foo_id ON name (foreign_key_pk, name_pk DESC);

DESC will only be slightly faster. Postgres can scan an index backwards almost as fast as forwards. But it can get tricky for multi-column indexes.
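To see whether the planner actually picks up the new index for the aggregate, one option is to inspect the plan of the subquery on its own. This is only a sketch: it assumes the index name_foo_id from above has been created, and whether an index scan is chosen depends on the Postgres version, the table statistics, and the data distribution.

```sql
-- Refresh planner statistics after creating the index.
ANALYZE name;

-- Shows the chosen plan plus actual timings and buffer usage.
-- Note: EXPLAIN ANALYZE executes the query (harmless for a SELECT).
EXPLAIN (ANALYZE, BUFFERS)
SELECT max(name_pk) AS max_pk
FROM   name
GROUP  BY foreign_key_pk;
```

If the plan still shows a Seq Scan on name feeding a HashAggregate, as in the question, the index is not being used for this step.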

And use this alternative syntax for the UPDATE:

UPDATE name n
SET    name_val = TRUE
FROM  (
    SELECT max(name_pk) AS max_pk
    FROM   name 
    GROUP  BY foreign_key_pk
  ) x
WHERE n.name_pk = x.max_pk
AND   name_val IS DISTINCT FROM TRUE;

IN tends to be the slowest solution for larger sets. A JOIN should be faster.

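The additional WHERE clause AND name_val IS DISTINCT FROM TRUE avoids (expensive) empty updates: Postgres writes a new row version for every updated row, even when the new value equals the old one, so rows that are already TRUE are better skipped.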
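The reason for IS DISTINCT FROM rather than a plain <> is NULL handling. The following standalone query is a small illustration (the inline VALUES list and the column aliases are made up for the example) of how the two comparisons differ for the three possible states of a boolean column:

```sql
-- Compare plain inequality with the NULL-safe IS DISTINCT FROM.
SELECT v,
       v <> TRUE               AS plain_inequality,
       v IS DISTINCT FROM TRUE AS null_safe_inequality
FROM  (VALUES (TRUE), (FALSE), (NULL::boolean)) AS t(v);

-- v     | plain_inequality | null_safe_inequality
-- true  | false            | false
-- false | true             | true
-- NULL  | NULL             | true
```

With name_val <> TRUE, rows where name_val is NULL would be filtered out and never updated. IS DISTINCT FROM TRUE keeps them while still skipping rows that are already TRUE.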

An anti-semi-join with NOT EXISTS might also be a contender for the performance crown:

UPDATE name n
SET    name_val = TRUE
WHERE  NOT EXISTS (
   SELECT 1
   FROM   name
   WHERE  foreign_key_pk = n.foreign_key_pk
   AND    name_pk > n.name_pk
   )
AND    name_val IS DISTINCT FROM TRUE;
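Which variant wins is best measured on the real table. A possible way to time a candidate without keeping its changes, sketched here for the NOT EXISTS form, is to run it under EXPLAIN ANALYZE inside a transaction and roll back afterwards:

```sql
BEGIN;

-- EXPLAIN ANALYZE really executes the UPDATE, hence the transaction.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE name n
SET    name_val = TRUE
WHERE  NOT EXISTS (
   SELECT 1
   FROM   name
   WHERE  foreign_key_pk = n.foreign_key_pk
   AND    name_pk > n.name_pk
   )
AND    name_val IS DISTINCT FROM TRUE;

ROLLBACK;  -- discard the changes; only the plan and timing output remain
```

The same wrapper works for the other variants. Keep in mind that a rolled-back UPDATE still leaves dead row versions behind for VACUUM to clean up, and that repeated runs benefit from a warm cache, so the first measurement is not directly comparable to later ones.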