Question

I have a very simple query:

UPDATE TableA
SET date_type = TableB.date_type
FROM TableB
WHERE TableB.int_type = TableA.int_type

My indexes are: TableA(int_type) and TableB(int_type, date_type).

EXPLAIN output:

Update on TableA  (cost=2788789.320..34222368.900 rows=82594592 width=261)
  ->  Hash Join  (cost=2788789.320..34222368.900 rows=82594592 width=261)
          Hash Cond: (TableA.int_type = TableB.int_type)
        ->  Seq Scan on tableA  (cost=0.000..12610586.960 rows=101433296 width=247)
        ->  Hash  (cost=1272403.920..1272403.920 rows=82594592 width=18)
              ->  Seq Scan on TableB  (cost=0.000..1272403.920 rows=82594592 width=18)

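The query took more than 3 hours.

How can I make it run faster? As the EXPLAIN output shows, no index is used. Should I choose different indexes, or make other changes, to speed up the query?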

PostgreSQL 9.6

Answer 1

For this query:

UPDATE TableA
SET date_type = TableB.date_type
FROM TableB
WHERE TableB.int_type = TableA.int_type

You can try an index on TableB(int_type, date_type).
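The question says this index already exists, so it may be worth confirming that before creating anything. A minimal sketch, assuming the tables were created with unquoted names (so the catalog stores them in lowercase); the index name tableb_int_type_date_type_idx is made up for illustration:

```sql
-- List the indexes that currently exist on both tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('tablea', 'tableb');

-- Create the composite index only if it is missing (hypothetical index name).
CREATE INDEX IF NOT EXISTS tableb_int_type_date_type_idx
    ON TableB (int_type, date_type);

-- Refresh planner statistics so the row estimates reflect the current data.
ANALYZE TableB;
```

When the update touches most rows of both tables, as the row estimates in the plan suggest, the planner may still prefer sequential scans with a hash join over an index scan.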

Answer 2

What you can do is avoid idempotent updates, that is, updates that write the value a row already has:

UPDATE TableA a
SET date_type = b.date_type
FROM TableB b
WHERE b.int_type = a.int_type
AND a.date_type IS DISTINCT FROM b.date_type  -- <<-- avoid updates with the same value
        ;
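To judge how much the extra condition would save, one option is to count the rows that would actually change before running the update. This is only a sketch against the table and column names from the question:

```sql
-- Count joined rows whose date_type differs (NULL-safe comparison).
-- If TableB holds several rows per int_type, this counts joined pairs,
-- so it can exceed the number of TableA rows that would be updated.
SELECT count(*) AS rows_to_change
FROM TableA a
JOIN TableB b ON b.int_type = a.int_type
WHERE a.date_type IS DISTINCT FROM b.date_type;
```

If the count is small compared with the roughly 100 million rows in TableA, most of the original update was rewriting unchanged values.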

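Also, you may be assuming a one-to-one relationship between A and B, but the DBMS does not know that. You can restrict the update to at most one source row per target row: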

EXPLAIN
UPDATE TableA a
SET date_type = b.date_type
FROM ( SELECT int_type, date_type
        , row_number() OVER(PARTITION BY int_type) AS rn
        FROM TableB
        ) b
WHERE b.int_type = a.int_type
AND a.date_type IS DISTINCT FROM b.date_type -- <<-- avoid idempotent updates
AND b.rn=1 -- <<-- allow only one update per target row.
        ;
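Whether the one-to-one assumption holds can be checked directly. A minimal sketch, using the names from the question, that lists a few int_type values appearing more than once in TableB:

```sql
-- Show up to 10 int_type values that occur more than once in TableB.
-- An empty result means each TableA row matches at most one TableB row.
SELECT int_type, count(*) AS n
FROM TableB
GROUP BY int_type
HAVING count(*) > 1
LIMIT 10;
```

If duplicates do exist, note that row_number() without an ORDER BY inside the OVER clause picks an arbitrary row per int_type; adding an ORDER BY there makes the choice deterministic.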
