How to write an efficient UPDATE-SELECT SQL statement

Date: 2015-03-16 09:10:09

Tags: sql postgresql sql-update

I wrote an SQL statement for a table with about 50,000,000 users. The query takes far longer than I expected: about 23 hours.

UPDATE users
    SET building_id = B.id
    FROM (
      SELECT *
      FROM buildings B
    ) AS B
    WHERE B.city          = address_city
      AND B.town          = address_town
      AND B.neighbourhood = address_neighbourhood
      AND B.street        = address_street
      AND B.no            = address_building_no

The idea of this statement is to remove the building/address information from users and reference the buildings table instead.

EXPLAIN

Update on users  (cost=22226900.43..22548054.14 rows=15212 width=166) 
->  Merge Join  (cost=22226900.43..22548054.14 rows=15212 width=166)
         Merge Cond: (((users.address_city)::text = (b.city)::text) AND ((users.address_town)::text = (b.town)::text) AND ((users.address_neighbourhood)::text = (b.neighbourhood)::text) AND ((users.address_street)::text = (b.street)::text) AND ((users.address_building_no)::text = (b.no)::text))
         ->  Sort  (cost=21352886.76..21401078.96 rows=96384398 width=156)
               Sort Key: users.address_city, users.address_town, users.address_neighbourhood, users.address_street, users.address_building_no
               ->  Seq Scan on users  (cost=0.00..2559921.19 rows=96384398 width=156)
         ->  Materialize  (cost=874013.68..883606.86 rows=9593179 width=63)
               ->  Sort  (cost=874013.68..878810.27 rows=9593179 width=63)
                     Sort Key: b.city, b.town, b.neighbourhood, b.street, b.no
                     ->  Seq Scan on buildings b  (cost=0.00..136253.54 rows=9593179 width=63)
(10 rows)

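I don't know whether this statement runs the inner SELECT once per user or caches its result. Also, if it does cache it, does it use indexes on the cached temporary table?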

I cannot write the statement like this:

FROM (
  SELECT * 
  FROM buildings B
  WHERE B.city          = users.address_city
    AND B.town          = users.address_town
    AND B.neighbourhood = users.address_neighbourhood
    AND B.street        = users.address_street
    AND B.no            = users.address_building_no
  )

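It reports that users cannot be accessed from the inner SELECT. Do you have any suggestions for how to access buildings in the inner SQL statement?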
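For illustration, here is a hedged sketch of one form in which the inner SELECT can see users. PostgreSQL does not let a subquery in the FROM clause of an UPDATE refer to the table being updated, but a scalar subquery in SET can. This is not equivalent to the join form: users with no matching building get building_id set to NULL, and the statement fails if more than one building matches a user. The index name buildings_address_idx is hypothetical, and the sketch assumes buildings has at most one row per address.

```sql
-- Hypothetical index so that each per-user lookup does not scan all of buildings.
CREATE INDEX buildings_address_idx
    ON buildings (city, town, neighbourhood, street, no);

-- Correlated scalar subquery: evaluated once per row of users.
-- Sets building_id to NULL where no building matches,
-- and raises an error if the subquery returns more than one row.
UPDATE users
SET building_id = (
    SELECT b.id
    FROM buildings b
    WHERE b.city          = users.address_city
      AND b.town          = users.address_town
      AND b.neighbourhood = users.address_neighbourhood
      AND b.street        = users.address_street
      AND b.no            = users.address_building_no
);
```

Because the subquery runs once per user, this form is not necessarily faster than the join; it mainly shows where a reference to users is allowed.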

2 answers:

Answer 0: (score: 1)

I think

create table t as select column_list from a join b on column=column;
alter table t rename to users;

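would be faster and would hold a lock for only microseconds... Of course, this applies only if the table is not being edited at the moment and there is enough space in temp_tablespace.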
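The outline above could be filled in roughly as follows. This is a sketch under stated assumptions, not a drop-in script: the names users_new and users_old and the columns u.id and u.name are hypothetical stand-ins for the real column list. A LEFT JOIN is used so that users without a matching building are kept. The existing users table has to be renamed (or dropped) before the new table can take its name.

```sql
BEGIN;

-- Build the replacement table in a single pass over both tables.
-- u.id and u.name are hypothetical; list the real users columns here.
CREATE TABLE users_new AS
SELECT u.id,
       u.name,
       b.id AS building_id
FROM users u
LEFT JOIN buildings b
       ON b.city          = u.address_city
      AND b.town          = u.address_town
      AND b.neighbourhood = u.address_neighbourhood
      AND b.street        = u.address_street
      AND b.no            = u.address_building_no;

-- Swap the tables; only these renames need an exclusive lock on users.
ALTER TABLE users     RENAME TO users_old;
ALTER TABLE users_new RENAME TO users;

COMMIT;
```

CREATE TABLE ... AS copies only the data, so indexes, constraints, and column defaults have to be recreated on the new table afterwards.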

Answer 1: (score: 0)

I'm not sure, but wouldn't this be faster (at least slightly, if not by a lot)?

UPDATE users
SET building_id = B.id
FROM buildings B
WHERE B.city          = address_city
  AND B.town          = address_town
  AND B.neighbourhood = address_neighbourhood
  AND B.street        = address_street
  AND B.no            = address_building_no

If nothing else, it should not need the Materialize stage shown in the EXPLAIN output above.
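One way to check this without waiting for the update to run is to ask for the plan only. Plain EXPLAIN (without ANALYZE) plans the statement but does not execute it, so no rows are modified. Whether the Materialize node actually disappears is not guaranteed: PostgreSQL usually flattens a simple subquery such as SELECT * FROM buildings, so both forms may well produce the same plan.

```sql
-- Shows the plan for the simplified statement without executing it.
EXPLAIN
UPDATE users
SET building_id = B.id
FROM buildings B
WHERE B.city          = address_city
  AND B.town          = address_town
  AND B.neighbourhood = address_neighbourhood
  AND B.street        = address_street
  AND B.no            = address_building_no;
```

Comparing this output with the plan in the question shows whether the rewrite changes anything.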