Question

I have a function with a performance problem:

totalCharge := 0;
FOR myRecord IN ... LOOP
    ......
    IF severalConditionsAreMet THEN
        BEGIN

            SELECT t1.charge INTO STRICT recordCharge
            FROM t1
            WHERE t1.id = myRecord.id AND otherComplexConditionsHere;

            totalCharge := totalCharge + recordCharge;

            ...........
        EXCEPTION
             WHEN OTHERS THEN 
                 NULL;
        END;
    END IF;

END LOOP;

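The function is called 232 times (not counting how many times the code inside the FOR is reached). The IF inside the FOR LOOP ends up being reached 4466 times, and it takes 561 seconds to complete all 4466 iterations.

For the particular data set I have, the IF is always entered, the SELECT above never returns data, and the code reaches the EXCEPTION branch every time. I changed the code to: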

totalCharge := 0;
FOR myRecord IN ... LOOP
    ......
    IF severalConditionsAreMet THEN

        SELECT t1.charge INTO recordCharge
        FROM t1
        WHERE t1.id = myRecord.id AND otherComplexConditionsHere;

        IF (recordCharge IS NULL) THEN
            CONTINUE;
        END IF;

        totalCharge := totalCharge + recordCharge;

        ...........

    END IF;

END LOOP;

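Note that the t1.charge column of table t1 is defined as NOT NULL. This time, the code inside the IF takes 1-2 seconds to complete all 4466 iterations.

Basically, all I did was replace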

BEGIN
…
EXCEPTION
….
END;

with

IF conditionIsNotMet THEN
    CONTINUE;         
END IF;

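Can someone explain to me why this worked? What happens behind the scenes? I suspect that when you catch exceptions in a LOOP and the code ends up raising an exception, Postgres cannot use cached plans to optimize that code, so it ends up planning the code on every iteration, and that causes the performance problem. Is my assumption correct?

Later edit:

I changed the example provided by Vao Tsun to reflect the case I am trying to illustrate.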

CREATE OR REPLACE FUNCTION initialVersion()
RETURNS VOID AS $$
declare
  testDate DATE;
begin
  for i in 1..999999 loop
    begin
    select now() into strict testDate where 1=0;
    exception when others 
    then null;
    end;
  end loop;
end;
$$ Language plpgsql;

CREATE OR REPLACE FUNCTION secondVersion()
RETURNS VOID AS $$
declare
    testDate DATE;
begin
  for i in 1..999999 loop
    select now() into testDate where 1=0;
    if testDate is null then 
      continue;
    end if;
  end loop;
end;
$$ Language plpgsql;

select initialVersion(); -- 19.7 seconds

select secondVersion(); -- 5.2 seconds

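As you can see, the difference is about 15 seconds. In the example I provided initially the difference is bigger, because the SELECT FROM t1 runs against complex data and takes more time to execute than the simple SELECT in the second example.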
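For anyone who wants to reproduce the comparison in a single run, here is a minimal sketch of a timing harness. It assumes initialVersion() and secondVersion() have already been created as shown above; the variable names are made up for illustration. It relies on clock_timestamp(), which keeps advancing inside a statement (unlike now(), which is fixed at transaction start).

```sql
DO $$
DECLARE
  -- hypothetical variable names, used only for this sketch
  started  timestamptz;
  halfway  timestamptz;
  finished timestamptz;
BEGIN
  started := clock_timestamp();
  PERFORM initialVersion();   -- EXCEPTION block entered and triggered on every iteration
  halfway := clock_timestamp();
  PERFORM secondVersion();    -- plain SELECT INTO followed by an IS NULL check
  finished := clock_timestamp();

  -- Subtracting two timestamptz values yields an interval
  RAISE NOTICE 'initialVersion: %, secondVersion: %',
    halfway - started, finished - halfway;
END;
$$;
```

The absolute numbers will depend on the machine and PostgreSQL version, so only the ratio between the two timings is worth comparing.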

Answer 1

I asked the same question on the PostgreSQL general mailing list and got some responses that cleared up this "mystery" for me:

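David G. Johnston:

"Tip: A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need."

I'm somewhat doubting "plan caching" has anything to do with this; I suspect it's basically that there is high memory and runtime overhead to deal with the possibility of needing to convert an exception into a branch instead of allowing it to be fatal.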

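Tom Lane:

Yeah, it's about the overhead of setting up and ending a subtransaction. That's a fairly expensive mechanism, but we don't have anything cheaper that is able to recover from arbitrary errors.

And an addition from David G. Johnston:

[...] setting up the pl/pgsql execution layer to trap "arbitrary SQL-layer exceptions" is fairly expensive. Even if the user specifies specific errors, the error handling mechanism in pl/pgsql is code for generic (arbitrary) errors.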

These responses helped me understand how things work. I am posting this answer here in the hope that it helps someone else.
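Given that the cost comes from the subtransaction behind the EXCEPTION block, one way to avoid it is to not raise an error for "no rows" in the first place. The following is a minimal sketch, not the original function: foundVersion is a hypothetical name, and the body reuses the dummy query from the question. It uses a non-STRICT SELECT INTO and checks the special FOUND variable, which PL/pgSQL sets to false when the query returns no row.

```sql
CREATE OR REPLACE FUNCTION foundVersion()   -- hypothetical name for this sketch
RETURNS VOID AS $$
declare
  testDate DATE;
begin
  for i in 1..999999 loop
    -- Without STRICT, zero rows is not an error:
    -- testDate is set to NULL and FOUND is set to false.
    select now() into testDate where 1=0;
    if not found then
      continue;   -- nothing to process for this iteration
    end if;
    -- work that needs testDate would go here
  end loop;
end;
$$ Language plpgsql;

select foundVersion();
```

Checking FOUND rather than IS NULL also works when the selected column is nullable. Note that without STRICT a query matching several rows silently uses the first one, so this only fits cases where at most one row is expected.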

Answer 2

Please give more details - I can't reproduce this:

t=# do
$$
declare
begin
  for i in 1..999999 loop
    perform now();
/*    exception when others then null; */
    if null then null; end if;
  end loop;
end;
$$
;
DO
Time: 1920.568 ms
t=# do
$$
declare
begin
  for i in 1..999999 loop
    begin
    perform now();
    exception when others then null;
    end;
  end loop;
end;
$$
;
DO
Time: 2417.425 ms

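As you can see, over a million iterations the difference is noticeable but not significant. Please run the same test on your machine - if you get the same results, you will need to provide more details...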

Why does catching errors in a LOOP cause performance problems?
