Question

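I have about 5 years of data that gets added to daily. There are about 15 million rows, and we're currently adding a few thousand rows per day.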

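For a bottlenecked processing pipeline, I'm considering replacing a materialized view with a table that I can add daily increments to. While trying to benchmark baseline performance, I ran into an anomaly I can't understand.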

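It appears that refreshing a materialized view defined by a simple select statement is heaps faster than inserting into a table using the equivalent select statement. I've run out of hypotheses. I'd appreciate any suggestions!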

I have the following:

create table rpb.question_parts
(
    id bigserial primary key,
    question_id bigint not null,
    constraint question_parts foreign key (question_id) references rpb.questions(id) on delete cascade on update cascade
);

create materialized view rpb.question_parts_mv as select * from rpb.question_parts with data;

create table rpb.question_parts_copy
(
    id bigint,
    question_id bigint
); -- note that there are no constraints, so no checking

After populating the base table (question_parts) with about 5 million rows, I run the following:

refresh materialized view rpb.question_parts_mv; 
-- takes about 2.8 seconds

and

insert into rpb.question_parts_copy select * from rpb.question_parts;
-- takes about 9 seconds
-- if there are any constraints (even deferred ones) it takes about 100 seconds
-- if I drop and add constraints (e.g. pkey) then adding them takes only 3 seconds
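
For anyone wanting to reproduce the comparison, here is a minimal sketch of how the two timings could be collected in a psql session. It assumes the tables and the view defined above already exist and are populated. The truncate step is an assumption added so that every run inserts the same number of rows; it is not part of the original test.

```sql
-- psql session; \timing is a psql meta-command that prints the elapsed time of each statement.
\timing on

-- Assumption: empty the target first so repeated runs are comparable.
truncate rpb.question_parts_copy;

-- EXPLAIN ANALYZE really executes the insert; BUFFERS adds buffer usage to the plan output.
explain (analyze, buffers)
insert into rpb.question_parts_copy
select * from rpb.question_parts;

-- REFRESH is a utility command and cannot be wrapped in EXPLAIN, so rely on \timing here.
refresh materialized view rpb.question_parts_mv;
```

Running each statement several times and comparing the reported times, rather than relying on a single run, should help rule out caching effects.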

So the results completely baffle me.

As I said, I'd really appreciate help from anyone with experience here in understanding this.

Why is refreshing the materialized view so much faster than an insert from the equivalent select * query?

0 answers: