Question

We have the following stored procedure, which has recently been running very slowly on larger data sets in our Postgres db. Questions:

We parse a string of the following form (the first number is the id of the row, the second is the state):

|| 2 | 0 || 3 | 1 || 4 | 0 ||

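Would it be better to parse and split the string and do the looping in a higher-level language like Java? Can the loop in Postgres be made more efficient? How are transactions handled in a stored procedure? Is the whole function one transaction? We are probably doing too many writes and deletes on the db. The deletes also take a long time. Can this be handled more efficiently?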

CREATE OR REPLACE FUNCTION verificaemitidos(entrada text, largo_mensaje integer)
  RETURNS character AS
$BODY$
    DECLARE                 
    texto_procesado text;       
    identificador bigint;
    estado_mensaje int;
    i int;
    existe_documento int;
    estado_documento text;
    rut numeric;
    tipo int;
    folio_doc numeric;
    otros_estados int;
    BEGIN
        --state 1: inserted
        --state 0: not inserted
        --message format: id_documento|estado||id_documento|estado||


        i := 1;
        while (i <= largo_mensaje)
        loop            
            --Parse the message
            texto_procesado := split_part(entrada,'||', i) ;
            identificador := split_part(texto_procesado, '|', 1);   
            estado_mensaje := split_part(texto_procesado, '|', 2);              
            -- Start the comparison
            existe_documento := (select count (id) from uris_emitidos where id = identificador);
            select estado, emp_rut, tipo_doc, folio into estado_documento, rut, tipo, folio_doc from uris_emitidos where id = identificador;

            --if the document exists
            if (existe_documento > 0) then              
                --if the submitted document is inserted
                if (estado_mensaje = 1) then
                    --if it is accepted, delete all documents with that rut, tipo, folio
                    if (estado_documento = 'A') then
                        delete from uris_emitidos where folio = folio_doc and emp_rut = rut and tipo_doc = tipo;
                    end if;
                    --if it is accepted with reservations, delete all documents with that rut, tipo, folio
                    if (estado_documento = 'B') then
                        delete from uris_emitidos where folio = folio_doc and emp_rut = rut and tipo_doc = tipo;
                    end if;
                    --if it is rejected, delete the rejected and the published ones
                    if (estado_documento = 'R') then
                        delete from uris_emitidos where folio = folio_doc and emp_rut = rut and tipo_doc = tipo and estado in ('R', 'P');
                    end if;
                    --if it is published, delete it
                    if (estado_documento = 'P') then
                        delete from uris_emitidos where id = identificador;
                    end if;
                --if the submitted document is not inserted
                else
                    --if it is accepted, update it so that the process re-queues it
                    if (estado_documento = 'A') then 
                        update uris_emitidos set estado_envio = 0, cont = (cont + 1) where id = identificador;                      
                    end if;
                    --if it is accepted with reservations, update it so that the process re-queues it
                    if (estado_documento = 'B') then
                        update uris_emitidos set estado_envio = 0, cont = (cont + 1) where id = identificador;                      
                    end if;
                    --if it is rejected, verify that no accepted record has been queued or is waiting to be queued
                    if (estado_documento = 'R') then
                        otros_estados = (select count(id) from uris_emitidos ue where ue.folio = folio_doc and ue.emp_rut = rut and ue.tipo_doc = tipo and ue.estado in ('A', 'B'));
                        --otros_estados = 0 means the rejected state is the best state there is, so it must be re-queued
                        if (otros_estados = 0) then
                            update uris_emitidos set estado_envio = 0, cont = (cont + 1) where id = identificador;
                        end if;
                    end if;
                    --if it is published, verify that no accepted or rejected record has been queued or is waiting to be queued
                    if (estado_documento = 'P') then
                        otros_estados = (select count(id) from uris_emitidos where folio = folio_doc and emp_rut = rut and tipo_doc = tipo and estado in ('A', 'B', 'R'));
                        --otros_estados = 0 means the published state is the best state there is, so it must be re-queued
                        if (otros_estados = 0) then
                            update uris_emitidos set estado_envio = 0, cont = (cont + 1) where id = identificador;
                        end if;
                    end if;

                end if;

            end if;

            i := i+1;
        end loop;
        return 'ok';


    END;
$BODY$
  LANGUAGE plpgsql VOLATILE;

Answer 1

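Can the loop in pgsql be more efficient?

As @wildplasser mentioned, running SQL statements that operate on sets of rows is generally much faster than manipulating each row individually. Loops are only possible in plpgsql (or in functions written in other procedural languages or, in a limited fashion, in recursive CTEs), not in plain SQL. They do their job just fine, but they are not the strong suit of PostgreSQL.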
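To make the difference concrete, here is a minimal sketch that applies the same change first row by row in a plpgsql loop and then as a single set-based statement. The temp table `loop_demo` and its columns are made up for illustration and are not part of the question's schema.

```sql
-- Hypothetical demo table, not part of the original schema
CREATE TEMP TABLE loop_demo (id int PRIMARY KEY, cont int NOT NULL DEFAULT 0);
INSERT INTO loop_demo (id) SELECT generate_series(1, 1000);

-- Row at a time: one UPDATE statement per id
DO
$$
DECLARE
   i int;
BEGIN
   FOR i IN 1..1000 LOOP
      UPDATE loop_demo SET cont = cont + 1 WHERE id = i;
   END LOOP;
END
$$;

-- Set-based: one UPDATE statement for all ids
UPDATE loop_demo SET cont = cont + 1 WHERE id BETWEEN 1 AND 1000;
```

Both variants change the rows in the same way, but the second one issues one statement instead of a thousand.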

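How are transactions handled in a stored procedure? Is the whole function one transaction?

Yes, the whole function runs as one transaction. It can be part of a bigger transaction, but it cannot be split up.

Read more about how plpgsql functions work in this related answer on dba.SE.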
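The all-or-nothing behavior can be observed with a small sketch like the one below. The table `tx_demo` and the function `insert_then_fail()` are hypothetical names chosen for this illustration. The function writes a row and then raises an error. The caller traps the error, and the row written by the function is gone.

```sql
-- Hypothetical demo objects, created in the session's temp schema
CREATE TEMP TABLE tx_demo (id int);

CREATE FUNCTION pg_temp.insert_then_fail()
  RETURNS void AS
$$
BEGIN
   INSERT INTO tx_demo VALUES (1);
   RAISE EXCEPTION 'something went wrong';
END
$$ LANGUAGE plpgsql;

DO
$$
BEGIN
   PERFORM pg_temp.insert_then_fail();
EXCEPTION WHEN OTHERS THEN
   -- the failed call is undone as a whole, including its INSERT
   RAISE NOTICE 'function failed: %', SQLERRM;
END
$$;

SELECT count(*) FROM tx_demo;  -- 0
```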

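Is it better to parse and split the string and loop in a higher-level language (like Java)?

If the string isn't huge (many thousands of elements), it doesn't really matter, as long as your logic is sound. It is not the string parsing that slows you down. It is the "one row at a time" manipulation of rows in the table.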
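For reference, parsing the sample string from the question into rows takes a single statement. This is only a sketch: `parsed_demo` is a hypothetical table name, and the casts assume that every element is well-formed.

```sql
-- Hypothetical temp table holding the parsed message
CREATE TEMP TABLE parsed_demo AS
SELECT split_part(t, '|', 1)::bigint AS identificador
     , split_part(t, '|', 2)::int    AS estado_mensaje
FROM   unnest(string_to_array(trim('||2|0||3|1||4|0||'::text, '|'), '||')) AS a(t);

SELECT * FROM parsed_demo ORDER BY identificador;
--  identificador | estado_mensaje
--              2 |              0
--              3 |              1
--              4 |              0
```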

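The much faster alternative is to do it all in one or a few SQL statements. I would use data-modifying CTEs for this (introduced with PostgreSQL 9.1): parse the string once, and run the DML statements against this internal working table.

Consider the following demo (untested):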

WITH a(t) AS (  -- split string into rows
    SELECT unnest(string_to_array(trim('||2|0||3|1||4|0||'::text, '|'), '||'))
    )
    , b AS (    -- split record into columns per row, cast to the column types
    SELECT split_part(t, '|', 1)::bigint AS identificador
          ,split_part(t, '|', 2)::int    AS estado_mensaje
    FROM   a
    )
    , c AS (    -- implements complete IF branch of your base loop
    DELETE FROM uris_emitidos u
    USING  b
    WHERE  u.id = b.identificador
    AND    u.estado IN ('A','B','R','P')
    AND    b.estado_mensaje = 1
    )
--  , d AS (    -- implements ELSE branch of your loop
--  DELETE ...
--  )
SELECT 'ok';

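On top of the principal design flaw, the logic in your loop is redundant and inconsistent. I consolidated the whole IF branch into the first DELETE statement above.

More information about the functions used is in the manual here.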

Answer 2

When you look at it, the arguments of the dreaded function (except largo_mensaje) can be seen as the fields of a worklist:

CREATE TABLE worklist
    ( texto_procesado text
    , identificador bigint
    , estado_mensaje int
    );

The corresponding work could then be done like this (I borrowed from Erwin's answer):

DELETE FROM uris_emitidos u
 USING  worklist wl
 WHERE  u.id = wl.identificador
   AND    u.estado IN ('A','B','R','P')
   AND    wl.estado_mensaje = 1
   AND    wl.texto_procesado IN ('action1' , 'action2', ...)
    ;

Afterwards, the worklist has to be cleaned up (AND NOT EXISTS (SELECT * FROM uris_emitidos WHERE ...)).

Answer 3

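The solution I have come up with for now is the following. It is definitely faster than the loop and reduces the number of writes and reads on the db. Thanks.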

CREATE OR REPLACE FUNCTION verificaemitidos2(entrada text, largo_mensaje integer)
  RETURNS character AS
$BODY$
    DECLARE
    texto_procesado text;
    identificador bigint;
    estado_mensaje int;
    i int;
    existe_documento int;
    estado_documento text;
    rut numeric;
    tipo int;
    folio_doc numeric;
    otros_estados int;
    BEGIN
        --state 1: inserted
        --state 0: not inserted
        --message format: id_documento|estado||id_documento|estado||

        --DROP TABLE worklist;
        CREATE TEMP TABLE worklist
            ( identificador bigint,
              estado_mensaje int,
              rut_emisor numeric,
              tipo_doc numeric,
              folio numeric,
              estado text
            );

        INSERT INTO worklist (identificador, estado_mensaje, rut_emisor, tipo_doc, folio, estado)
            SELECT split_part(t, '|', 1)::bigint ,
              split_part(t, '|', 2)::integer ,
              uri.emp_rut,
              uri.tipo_doc,
              uri.folio,
              uri.estado
              from (SELECT unnest(string_to_array(trim(entrada::text, '|'), '||'))) as a(t),
              uris_emitidos uri
              WHERE uri.id = split_part(t, '|', 1)::bigint;

        -- STATE 1
        -- ACCEPTED: FIRST TWO CASES

        DELETE FROM uris_emitidos u
         USING  worklist wl
         WHERE  wl.estado_mensaje = 1
           AND  wl.estado IN ('A', 'B')
           AND  u.folio = wl.folio
           AND  u.emp_rut = wl.rut_emisor
           AND  u.tipo_doc =  wl.tipo_doc;

        -- STATE 1
        -- CASE 3

        --delete from uris_emitidos where folio = folio_doc and emp_rut = rut and tipo_doc = tipo and estado in ('R', 'P');

        DELETE FROM uris_emitidos u
         USING  worklist wl
         WHERE  wl.estado_mensaje = 1
           AND  wl.estado IN ('R')
           AND  u.estado IN ('R', 'P')
           AND  u.folio = wl.folio
           AND  u.emp_rut = wl.rut_emisor
           AND  u.tipo_doc =  wl.tipo_doc;

        -- STATE 1
        -- CASE 4

         DELETE FROM uris_emitidos u
         USING  worklist wl
         WHERE  u.id = wl.identificador
           AND  wl.estado_mensaje = 1
           AND  wl.estado = 'P';

        -- STATE 0
        -- CASES 1+2

        UPDATE uris_emitidos u
        SET estado_envio = 0, cont =  (u.cont + 1)
        FROM worklist wl
        WHERE  u.id = wl.identificador
        AND  wl.estado_mensaje = 0
        AND  wl.estado IN ('A' , 'B');

         -- update uris_emitidos set estado_envio = 0, cont = (cont + 1) where id = identificador;

        -- STATE 0
        -- CASE 3

        UPDATE uris_emitidos u
        SET estado_envio = 0, cont =  (u.cont + 1)
        FROM worklist wl
        WHERE  u.id = wl.identificador
        AND  wl.estado_mensaje = 0
        AND  wl.estado IN ('R')
        AND NOT EXISTS (
        SELECT 1 FROM uris_emitidos ue
        WHERE ue.folio = wl.folio
        AND ue.emp_rut = wl.rut_emisor
        AND ue.tipo_doc = wl.tipo_doc
        AND ue.estado IN ('A', 'B'));

        -- STATE 0
        -- CASE 4

        UPDATE uris_emitidos u
        SET estado_envio = 0, cont =  (u.cont + 1)
        FROM worklist wl
        WHERE  u.id = wl.identificador
        AND  wl.estado_mensaje = 0
        AND  wl.estado IN ('P')
        AND NOT EXISTS (
        SELECT 1 FROM uris_emitidos ue
        WHERE ue.folio = wl.folio
        AND ue.emp_rut = wl.rut_emisor
        AND ue.tipo_doc = wl.tipo_doc
        AND ue.estado IN ('A', 'B', 'R'));

        DROP TABLE worklist;

        RETURN 'ok';
    END;
$BODY$
  LANGUAGE plpgsql VOLATILE;

Answer 4

Your posted answer can be improved and simplified:

CREATE OR REPLACE FUNCTION x.verificaemitidos3(_entrada text)
  RETURNS text AS
$BODY$
BEGIN
   --state 1: inserted
   --state 0: not inserted

   CREATE TEMP TABLE worklist ON COMMIT DROP AS
   SELECT split_part(t, '|', 1)::bigint AS identificador
         ,split_part(t, '|', 2)::integer AS estado_mensaje
         ,uri.emp_rut AS rut_emisor
         ,uri.tipo_doc
         ,uri.folio
         ,uri.estado
   FROM  (SELECT unnest(string_to_array(trim(_entrada::text, '|'), '||'))) a(t)
   JOIN   uris_emitidos uri ON uri.id = split_part(t, '|', 1)::bigint;

   -- STATE 1

   DELETE FROM uris_emitidos u
   USING  worklist w
   WHERE  w.estado_mensaje = 1
   AND   (
         (w.estado IN ('A', 'B')   -- CASES 1+2
      OR  w.estado =   'R'         -- CASE 3
      AND u.estado IN ('R', 'P')
      )
      AND u.folio = w.folio
      AND u.emp_rut = w.rut_emisor
      AND u.tipo_doc =  w.tipo_doc

      OR (w.estado = 'P'           -- CASE 4
      AND w.identificador = u.id
      )
      );

   -- STATE 0

   UPDATE uris_emitidos u
   SET    estado_envio = 0
         ,cont = cont + 1
   FROM   worklist w
   WHERE  w.estado_mensaje = 0
   AND    w.identificador = u.id
   AND   (w.estado IN ('A', 'B')   -- CASES 1+2

      OR  w.estado = 'R'           -- CASE 3
      AND NOT EXISTS (
         SELECT 1
         FROM   uris_emitidos ue
         WHERE  ue.folio = w.folio
         AND    ue.emp_rut = w.rut_emisor
         AND    ue.tipo_doc = w.tipo_doc
         AND    ue.estado IN ('A', 'B')
         )

      OR  w.estado = 'P'         -- CASE 4
      AND NOT EXISTS (
         SELECT 1
         FROM   uris_emitidos ue
         WHERE  ue.folio = w.folio
         AND    ue.emp_rut = w.rut_emisor
         AND    ue.tipo_doc = w.tipo_doc
         AND    ue.estado IN ('A', 'B', 'R')
         )
      );

   RETURN 'ok';
END;
$BODY$  LANGUAGE plpgsql VOLATILE;

Major points

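Half as long, twice as fast.
Remove the second function parameter. You don't need it any more.
Remove all variables. None of them is used any more.
Use ON COMMIT DROP to avoid conflicts with an existing temp table on repeated calls within one session. Since Postgres 9.1 you could alternatively use CREATE TABLE IF NOT EXISTS.
Create the temp table directly from the result of a SELECT with CREATE TABLE AS.
Merge all DELETEs.
Merge all UPDATEs.
Trim some noise.
Use the return type text instead of character.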
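The two ways of handling the temp table mentioned above can be sketched as follows. The table names `worklist_demo` and `worklist_keep` are made up for this illustration, and `to_regclass()` assumes Postgres 9.4 or later.

```sql
-- Variant 1: ON COMMIT DROP removes the table when the transaction ends,
-- so the next call in the same session starts from a clean slate.
BEGIN;
CREATE TEMP TABLE worklist_demo ON COMMIT DROP AS
SELECT 2::bigint AS identificador, 0 AS estado_mensaje;
COMMIT;

SELECT to_regclass('pg_temp.worklist_demo');  -- NULL, the table is gone

-- Variant 2 (Postgres 9.1+): keep one table for the whole session,
-- create it only if it is missing, and empty it before each use.
CREATE TEMP TABLE IF NOT EXISTS worklist_keep (
   identificador  bigint
 , estado_mensaje int
);
TRUNCATE worklist_keep;
```

Variant 1 needs no cleanup code at all. Variant 2 avoids creating and dropping a table on every call, at the cost of the extra TRUNCATE.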

Performance of a plpgsql stored procedure?
