Given this database schema:
CREATE TABLE a (
datum INTEGER NOT NULL REFERENCES datum_details(id),
processed_datum INTEGER REFERENCES datum_details(id),
-- other columns
PRIMARY KEY(datum)
);
CREATE INDEX a__processed ON a(processed_datum);
CREATE TABLE b (
locale TEXT NOT NULL,
datum INTEGER NOT NULL REFERENCES datum_details(id),
-- other columns
PRIMARY KEY (locale, datum)
);
CREATE INDEX b__datum ON b(datum);
I want to do this:
INSERT INTO b (locale, datum)
SELECT locale, datum FROM
(SELECT DISTINCT processed_datum FROM a
WHERE processed_datum IS NOT NULL) x(datum),
-- array on next line is an example value for a query parameter
(SELECT UNNEST(ARRAY['us', 'uk', 'cn', 'jp'])) y(locale)
EXCEPT
SELECT locale, datum FROM b;
Table a has 5,532,592 rows with a non-null processed_datum, and the array supplied to UNNEST has 10 to 20 entries. The real version of the INSERT above ran for nearly an hour. Surely there is a faster way? Fifty million 2-tuples should fit comfortably in RAM, and it shouldn't take more than a few seconds to sort them or to take a set difference.
Here is an EXPLAIN dump, in case it helps:
Insert on b (cost=65284579.58..68540566.70 rows=144531000 width=36)
-> Subquery Scan on "*SELECT*" (cost=65284579.58..68540566.70 rows=144531000 width=36)
-> SetOp Except (cost=65284579.58..67095256.70 rows=144531000 width=24)
-> Sort (cost=65284579.58..65888138.62 rows=241423616 width=24)
Sort Key: "*SELECT* 1".locale, "*SELECT* 1".datum
-> Append (cost=1187015.87..6914612.59 rows=241423616 width=24)
-> Subquery Scan on "*SELECT* 1" (cost=1187015.87..4488192.27 rows=144531000 width=36)
-> Nested Loop (cost=1187015.87..3042882.27 rows=144531000 width=36)
-> Unique (cost=1187015.87..1221789.91 rows=1445310 width=4)
-> Sort (cost=1187015.87..1204402.89 rows=6954808 width=4)
Sort Key: a.processed_datum
-> Seq Scan on a (cost=0.00..111352.57 rows=6954808 width=4)
Filter: (processed_datum IS NOT NULL)
-> Materialize (cost=0.00..2.01 rows=100 width=32)
-> Result (cost=0.00..0.51 rows=100 width=0)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..2426420.32 rows=96892616 width=7)
-> Seq Scan on b b_1 (cost=0.00..1457494.16 rows=96892616 width=7)
EDIT: After raising work_mem, shared_buffers, and effective_cache_size to 512MB, 2GB, and 4GB respectively, here is an EXPLAIN ANALYZE:
Insert on b (cost=44887364.38..48178884.98 rows=146154400 width=36) (actual time=3914166.825..3914166.825 rows=0 loops=1)
-> Subquery Scan on "*SELECT*" (cost=44887364.38..48178884.98 rows=146154400 width=36) (actual time=3685048.686..3788573.419 rows=5693368 loops=1)
-> SetOp Except (cost=44887364.38..46717340.98 rows=146154400 width=24) (actual time=3685048.682..3774876.094 rows=5693368 loops=1)
-> Sort (cost=44887364.38..45497356.58 rows=243996880 width=24) (actual time=3217525.970..3615432.663 rows=127088841 loops=1)
Sort Key: "*SELECT* 1".locale, "*SELECT* 1".datum
Sort Method: external merge Disk: 2733064kB
-> Append (cost=128839.10..5891963.33 rows=243996880 width=24) (actual time=20301.567..785480.666 rows=127088841 loops=1)
-> Subquery Scan on "*SELECT* 1" (cost=128839.10..3446545.73 rows=146154400 width=36) (actual time=20301.565..166376.863 rows=22130368 loops=1)
-> Nested Loop (cost=128839.10..1985001.73 rows=146154400 width=36) (actual time=20301.561..119491.552 rows=22130368 loops=1)
-> HashAggregate (cost=128839.10..143454.54 rows=1461544 width=4) (actual time=20301.492..30209.211 rows=5532592 loops=1)
-> Seq Scan on a (cost=0.00..111421.99 rows=6966842 width=4) (actual time=0.047..8712.474 rows=6963036 loops=1)
Filter: (processed_datum IS NOT NULL)
Rows Removed by Filter: 46118
-> Materialize (cost=0.00..2.01 rows=100 width=32) (actual time=0.001..0.005 rows=4 loops=5532592)
-> Result (cost=0.00..0.51 rows=100 width=0) (actual time=0.044..0.052 rows=4 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..2445417.60 rows=97842480 width=7) (actual time=93.293..356716.139 rows=104958473 loops=1)
-> Seq Scan on b b_1 (cost=0.00..1466992.80 rows=97842480 width=7) (actual time=93.288..134378.648 rows=104958473 loops=1)
Trigger for constraint b_datum_fkey: time=112020.669 calls=5693368
Trigger for constraint b_processed_fkey: time=12847.165 calls=5693368
Trigger for constraint b_detail_fkey: time=13007.502 calls=5693368
Total runtime: 4072066.380 ms
That is over an hour of wall-clock time in total. I don't fully understand what all these numbers mean, but it looks like most of the cost is in the top-level sort, and this:
Sort Method: external merge Disk: 2733064kB
...doesn't make any sense at all. Fifty million 2-tuples, each consisting of a four-byte integer and a string of two to four characters, should serialize to roughly 400 megabytes on disk for sorting. I could imagine a factor of two in overhead, but not a factor of seven.
EDIT 2: Here is an EXPLAIN (ANALYZE, BUFFERS) of the "anti-join" technique suggested by Clockwork-Muse. It is faster, but only by about 27%, which is not enough to call the problem solved ("solved" would be a formulation that runs in under a minute at this scale, and ideally in under a second). Note that since zero rows were inserted, this falsifies Clodoaldo Neto's suggestion that the bottleneck is index updates. As far as I can tell, it still spends most of its time on the external sort and merge.
Insert on b (cost=42686194.35..45105977.45 rows=115655360 width=36) (actual time=2948649.994..2948649.994 rows=0 loops=1)
Buffers: shared hit=103168 read=451882, temp read=277204 written=289345
-> Merge Anti Join (cost=42686194.35..45105977.45 rows=115655360 width=36) (actual time=2948649.984..2948649.984 rows=0 loops=1)
Merge Cond: (((unnest('{us,uk,cn,jp}'::text[])) = exc.locale) AND (a.processed_datum = exc.datum))
Buffers: shared hit=103168 read=451882, temp read=277204 written=289345
-> Sort (cost=25512243.57..25873666.57 rows=144569200 width=36) (actual time=494618.787..546275.100 rows=22130368 loops=1)
Sort Key: (unnest('{us,uk,cn,jp}'::text[])), a.processed_datum
Sort Method: external merge Disk: 367752kB
Buffers: shared hit=4 read=41286, temp read=45973 written=45973
-> Nested Loop (cost=128828.23..1964858.83 rows=144569200 width=36) (actual time=21022.002..122392.505 rows=22130368 loops=1)
Buffers: shared read=41286
-> HashAggregate (cost=128828.23..143285.15 rows=1445692 width=4) (actual time=21021.642..30776.200 rows=5532592 loops=1)
Buffers: shared read=41286
-> Seq Scan on a (cost=0.00..111403.46 rows=6969909 width=4) (actual time=0.057..8959.278 rows=6963036 loops=1)
Filter: (processed_datum IS NOT NULL)
Rows Removed by Filter: 46118
Buffers: shared read=41286
-> Materialize (cost=0.00..2.01 rows=100 width=32) (actual time=0.001..0.005 rows=4 loops=5532592)
-> Result (cost=0.00..0.51 rows=100 width=0) (actual time=0.338..0.344 rows=4 loops=1)
-> Materialize (cost=17173950.78..17704603.74 rows=106130592 width=7) (actual time=1769582.325..2204210.714 rows=105119249 loops=1)
Buffers: shared hit=103164 read=410596, temp read=231231 written=243372
-> Sort (cost=17173950.78..17439277.26 rows=106130592 width=7) (actual time=1769582.298..1978474.757 rows=105119249 loops=1)
Sort Key: exc.locale, exc.datum
Sort Method: external merge Disk: 1946944kB
Buffers: shared hit=103164 read=410596, temp read=231231 written=243372
-> Seq Scan on b exc (cost=0.00..1575065.92 rows=106130592 width=7) (actual time=91.383..158841.821 rows=110651841 loops=1)
Buffers: shared hit=103164 read=410596
Total runtime: 2949399.213 ms
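For reference, the sort that spills to disk is governed by work_mem, and it can be raised for a single transaction instead of globally. A minimal sketch, assuming the original EXCEPT formulation; the '2GB' value is illustrative, not a recommendation:

BEGIN;
-- Raise sort memory for this transaction only (illustrative value; size it
-- against the sort actually reported by EXPLAIN ANALYZE).
SET LOCAL work_mem = '2GB';

INSERT INTO b (locale, datum)
SELECT locale, datum FROM
    (SELECT DISTINCT processed_datum FROM a
      WHERE processed_datum IS NOT NULL) x(datum),
    (SELECT UNNEST(ARRAY['us', 'uk', 'cn', 'jp'])) y(locale)
EXCEPT
SELECT locale, datum FROM b;

COMMIT;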
Answer 0 (score: 2)
You may be able to do better with an exception join, essentially (DB2 has this as an actual join type; everyone else has to use this form):
INSERT INTO b (locale, datum)
SELECT y.locale, x.datum
FROM (SELECT DISTINCT processed_datum
FROM a
WHERE processed_datum IS NOT NULL) x(datum)
CROSS JOIN (SELECT UNNEST(ARRAY['us', 'uk', 'cn', 'jp'])) y(locale)
LEFT JOIN b Exception
ON Exception.locale = y.locale
AND Exception.datum = x.datum
WHERE Exception.locale IS NULL
Or possibly the correlated form:
INSERT INTO b (locale, datum)
SELECT locale, datum
FROM (SELECT DISTINCT processed_datum
FROM a
WHERE processed_datum IS NOT NULL) x(datum)
CROSS JOIN (SELECT UNNEST(ARRAY['us', 'uk', 'cn', 'jp'])) y(locale)
WHERE NOT EXISTS(SELECT 1
FROM b Exception
WHERE Exception.locale = y.locale
AND Exception.datum = x.datum)
Answer 1 (score: 1)
The not exists variant suggested by Clockwork-Muse will frequently outperform the alternatives. But I suspect most of the cost is in maintaining the primary key and the b__datum index while the rows are being inserted; building the indexes once, after the fact, is cheaper. Try dropping the primary key and the index before the insert:
alter table b drop constraint b_pkey;
drop index b__datum;
and then rebuilding them after the insert:
alter table b add primary key (locale, datum);
create index b__datum on b(datum);
Do all of this inside a transaction, so it can be rolled back if the insert fails with a duplicate primary key:
begin;
alter table b drop constraint b_pkey;
drop index b__datum;
insert ...
alter table b add primary key (locale, datum);
create index b__datum on b(datum);
commit; -- or rollback if it fails
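For completeness, a sketch that combines this with the NOT EXISTS form from the answer above. It assumes the primary-key constraint really is named b_pkey, as in the snippet:

begin;
alter table b drop constraint b_pkey;
drop index b__datum;

insert into b (locale, datum)
select y.locale, x.datum
from (select distinct processed_datum
        from a
       where processed_datum is not null) x(datum)
cross join (select unnest(array['us', 'uk', 'cn', 'jp'])) y(locale)
where not exists (select 1
                    from b exc
                   where exc.locale = y.locale
                     and exc.datum = x.datum);

alter table b add primary key (locale, datum);
create index b__datum on b(datum);
commit; -- or rollback if the index rebuild fails

Note that with the index and primary key dropped, the NOT EXISTS check can no longer use an index lookup on b, so whether this wins over inserting into an indexed b depends on the data.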
Answer 2 (score: 0)
CREATE TABLE a
( datum INTEGER NOT NULL -- REFERENCES datum_details(id)
, processed_datum INTEGER -- REFERENCES datum_details(id)
-- other columns
, PRIMARY KEY(datum)
);
CREATE INDEX a__processed ON a(processed_datum);
CREATE TABLE b (
locale TEXT NOT NULL
, datum INTEGER NOT NULL -- REFERENCES datum_details(id)
-- other columns
, PRIMARY KEY (locale, datum)
);
-- this index might help more than the original single-field index
CREATE UNIQUE INDEX b__datum ON b(datum, locale);
-- some test data
INSERT INTO a(datum, processed_datum)
SELECT gs, gs * 5
FROM generate_series(1,1000) gs
;
UPDATE a SET processed_datum = NULL WHERE random() < 0.5 ;
-- some test data
INSERT INTO b (datum,locale)
SELECT gs, 'us'
FROM generate_series(1,1000) gs
;
UPDATE b SET locale = 'uk' WHERE random() < 0.5;
UPDATE b SET locale = 'cn' WHERE random() < 0.4;
UPDATE b SET locale = 'jp' WHERE random() < 0.3;
VACUUM ANALYZE a;
VACUUM ANALYZE b;
-- EXPLAIN ANALYZE
INSERT INTO b (locale, datum)
SELECT y.locale, x.datum FROM
(SELECT DISTINCT processed_datum AS datum
FROM a
WHERE a.processed_datum IS NOT NULL
) AS x
, (SELECT UNNEST(ARRAY['us', 'uk', 'cn', 'jp']) AS locale
) AS y
WHERE NOT EXISTS (
SELECT * FROM b nx
WHERE nx.locale = y.locale
AND nx.datum = x.datum
)
;
Note: apart from the index, this is very similar to @Clockwork-Muse's answer.