为什么postgres功能如此之慢但单个查询速度快?

时间:2015-03-17 03:02:26

标签: postgresql plpgsql postgresql-performance

我的职责是让员工处于“创建”状态。

    CREATE OR REPLACE FUNCTION get_probation_contract(AccountOrEmpcode TEXT, FromDate DATE,
                                                  ToDate           DATE)
  RETURNS TABLE("EmpId" INTEGER, "EmpCode" CHARACTER VARYING,
  "DomainAccount" CHARACTER VARYING, "JoinDate" DATE,
  "ContractTypeCode" CHARACTER VARYING, "ContractTypeName" CHARACTER VARYING,
  "ContractFrom" DATE, "ContractTo" DATE, "ContractType" CHARACTER VARYING,
  "Signal" CHARACTER VARYING) AS $$
BEGIN
  RETURN QUERY
  EXECUTE 'SELECT
    he.id                                                                       "EmpId",
    rr.code                                                                     "EmpCode",
    he.login                                                                    "DomainAccount",
    he.join_date                                                                "JoinDate",
    contract_type.code                                                          "ContractTypeCode",
    contract_type.name                                                          "ContractTypeName",
    contract.date_start                                                         "ContractFrom",
    contract.date_end                                                           "ContractTo",
    CASE WHEN contract_group.code = ''1'' THEN ''Probation''
    WHEN contract_group.code IN (''3'', ''4'', ''5'') THEN ''Official''
    WHEN contract_group.code = ''2'' THEN ''Collaborator'' END :: CHARACTER VARYING "ContractType",
    ''CREATE'' :: CHARACTER VARYING                                               "Signal"
  FROM
    hr_employee he
    INNER JOIN resource_resource rr
      ON rr.id = he.resource_id
    INNER JOIN hr_contract contract
      ON contract.employee_id = he.id AND contract.date_start = (
      SELECT max(date_start) "date_start"
      FROM hr_contract cc
      WHERE cc.employee_id = contract.employee_id
    )
    INNER JOIN hr_contract_type contract_type
      ON contract_type.id = contract.type_id
    INNER JOIN hr_contract_type_group contract_group
      ON contract_group.id = contract_type.contract_type_group_id
  WHERE
    contract_group.code = ''1''


    AND
    ($1 IS NULL OR $1 = '''' OR rr.code = $1 OR
     he.login = $1)
    AND (
      (he.join_date BETWEEN $2 AND $3)
      OR (he.join_date IS NOT NULL AND (contract.date_start BETWEEN $2 AND $3))
      OR (he.create_date BETWEEN $2 AND $3 AND he.create_date > he.join_date)
    )
    AND rr.active = TRUE
'using AccountOrEmpcode, FromDate, ToDate ;
END;
$$ LANGUAGE plpgsql;

执行

需要37秒
SELECT *
 FROM get_probation_contract('', '2014-01-01', '2014-06-01');

当我使用单一查询时

    SELECT
      he.id                                                                       "EmpId",
      rr.code                                                                     "EmpCode",
      he.login                                                                    "DomainAccount",
      he.join_date                                                                "JoinDate",
      contract_type.code                                                          "ContractTypeCode",
      contract_type.name                                                          "ContractTypeName",
      contract.date_start                                                         "ContractFrom",
      contract.date_end                                                           "ContractTo",
      CASE WHEN contract_group.code = '1' THEN 'Probation'
      WHEN contract_group.code IN ('3', '4', '5') THEN 'Official'
      WHEN contract_group.code = '2' THEN 'Collaborator' END :: CHARACTER VARYING "ContractType",
      'CREATE' :: CHARACTER VARYING                                               "Signal"
    FROM
      hr_employee he
      INNER JOIN resource_resource rr
        ON rr.id = he.resource_id
      INNER JOIN hr_contract contract
        ON contract.employee_id = he.id AND contract.date_start = (
        SELECT max(date_start) "date_start"
        FROM hr_contract
        WHERE employee_id = he.id
      )
      INNER JOIN hr_contract_type contract_type
        ON contract_type.id = contract.type_id
      INNER JOIN hr_contract_type_group contract_group
        ON contract_group.id = contract_type.contract_type_group_id
    WHERE
      contract_group.code = '1'
AND (
      (he.join_date BETWEEN '2014-01-01' AND '2014-06-01')
      OR (he.join_date IS NOT NULL AND (contract.date_start BETWEEN '2014-01-01' AND '2014-01-06'))
      OR (he.create_date BETWEEN '2014-01-01' AND '2014-01-06' AND he.create_date > he.join_date)
    )
    AND rr.active = TRUE

完成

需要5秒钟

如何优化上述功能。 以及为什么函数比单个查询慢得多,甚至我在函数中使用执行'select ...'。

在每个表的字段id中建立索引。

2 个答案:

答案 0 :(得分:2)

可能的原因是对预准备语句(嵌入式SQL)进行盲目优化。在新的PostgreSQL版本中它稍微好一些,尽管它也可能是问题所在。 PL / pgSQL中嵌入式SQL中的执行计划可以重复用于更多调用 - 并且它针对更多的值进行了优化(不是真正使用的值)。有时这种差异可能会导致非常大的减速。

然后您可以使用动态SQL - EXECUTE语句。动态SQL仅使用一次执行的计划,它使用实际参数。它应该解决这个问题。

带有重用准备计划的嵌入式SQL示例。

CREATE OR REPLACE FUNCTION fx1(_surname text)
RETURNS int AS $$
BEGIN
  RETURN (SELECT count(*) FROM people WHERE surname = _surname)
END;

动态SQL示例:

CREATE OR REPLACE FUNCTION fx2(_surname text)
RETURNS int AS $$
DECLARE result int;
BEGIN
  EXECUTE 'SELECT count(*) FROM people WHERE surname = $1' INTO result
      USING _surname;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

如果您的数据集中包含一些可怕的姓氏,则第二个函数会更快 - 然后常见的计划将是seq scan,但很多时候您会询问其他姓氏,并且您会想要使用index scan 。动态查询参数化(如($1 IS NULL OR $1 = '''' OR rr.code = $1 OR)具有相同的效果。

答案 1 :(得分:1)

您的查询不一样

第一个有

WHERE cc.employee_id = contract.employee_id

第二个有:

WHERE employee_id = he.id

还有:

($1 IS NULL OR $1 = '''' OR rr.code = $1 OR
 he.login = $1)

请使用相同的查询和相同的值再次测试。