Query planner not using filter to limit high-cost join

Time: 2016-12-23 17:47:45

Tags: sql postgresql query-performance

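I am trying to resolve a performance problem with a query in PostgreSQL.

General concept being modelled: software licences being purchased and assigned. I think I have stripped out enough of the other things being modelled that it is now very similar to a standard hotel room booking system, except that it is normal for the current assignment (hotel booking) to have no known end date.

Purpose of the query: this is a view that gathers the information needed to display details about a licence and where it came from. When the application queries the view, it supplies a tLicences.id so that a single row is returned.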

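There are also some non-hotel-like concepts in the query:

  • Some licence agreements restrict how quickly software may be reassigned; this has been hard-coded into the query as 1 day.
  • In theory, a licence can have both a past and a current assignment at the same time; this should not happen, and the application discourages it, but if a human makes a mistake in the real world, the application does allow that mistake to be entered into the system. This is obviously different from a normal hotel system, where the current occupant would object if a guest walked into the wrong room.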
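The one-day rule can be looked at in isolation. The following is only a sketch with made-up timestamps, assuming the date columns are timestamps (the schema is not shown here); it mirrors the GREATEST(...) expression used for licence_availability_start in the query further down.

```sql
-- Made-up values; column types are an assumption.
SELECT GREATEST(
         timestamp '2016-01-01',                     -- purchase date
         timestamp '2016-03-10',                     -- end of the older assignment
         timestamp '2016-02-01' + interval '1 day'   -- start of the older assignment + 1 day
       ) AS licence_availability_start;
-- 2016-03-10 00:00:00

-- PostgreSQL's GREATEST ignores NULL arguments, so an older assignment
-- with no end date falls back to the other two values.
SELECT GREATEST(
         timestamp '2016-01-01',
         NULL::timestamp,
         timestamp '2016-02-01' + interval '1 day'
       ) AS licence_availability_start;
-- 2016-02-02 00:00:00
```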

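The nested SELECT with the alias purchase_quantities_assignnments is a view in the database (inlined here for convenience). Ideally, I would like to fix my performance problem without having to inline a modified version of this view into the query; the view should be able to carry on existing as it is and be used in other ways by other queries.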

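Problem

If I query this view (the query) with WHERE tLicences.id = 19, the results take a long time to produce. In particular, it seems that the entire set is being generated for periodsOfAvailability_start (which is slow) and only then joined; this conclusion is based on the GroupAggregate node in EXPLAIN ANALYZE returning 10 rows (which is the number of purchases). I feel that the query planner should be able to work out that tLicences.purchase_id can be used to significantly reduce the amount of periodsOfAvailability_start that needs to be generated.

However, if I query this view (the query) with WHERE tLicences.id = 19 AND tLicences.purchase_id = ? [where ? is the purchase ID of that licence], the query behaves as expected, generating only the part of periodsOfAvailability_start that has that purchase ID; this conclusion is based on the GroupAggregate node in EXPLAIN ANALYZE returning 1 row (which is the single purchase the licence belongs to).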
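The comparison described above can be reproduced along these lines. This is a sketch only: the view name test.vLicenceSummary, the column names licence_id and purchase_id, and the purchase ID 7 are hypothetical stand-ins, since the real view definition is not shown here.

```sql
-- Hypothetical view and column names; substitute the real ones.

-- Slow case: filter on the licence only.
EXPLAIN ANALYZE
SELECT *
FROM test.vLicenceSummary
WHERE licence_id = 19;

-- Fast case: the purchase ID is supplied as well (7 is a placeholder).
EXPLAIN ANALYZE
SELECT *
FROM test.vLicenceSummary
WHERE licence_id = 19
  AND purchase_id = 7;
```

In each plan, the figure to compare is the actual rows value on the GroupAggregate node: one row per purchase in the first case would indicate that the filter was not pushed into the aggregate.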

Query

SELECT *
FROM test.tPurchases AS tPurchases
INNER JOIN test.tLicences ON tLicences.purchase_id = tPurchases.id
LEFT JOIN (
  SELECT
    purchase_id,
    SUM(CASE assignment_newer_id IS NOT null WHEN true THEN 1 WHEN false THEN 0 END) AS prchs_quantity_assigned,
    SUM(CASE assignment_newer_id IS null AND current_timestamp BETWEEN licence_availability_start AND licence_availability_end WHEN true THEN 1 WHEN false THEN 0 END) AS prchs_quantity_notAssignedAndCanBeAssigned,
    SUM(CASE assignment_newer_id IS null AND current_timestamp < licence_availability_start WHEN true THEN 1 WHEN false THEN 0 END) AS prchs_quantity_notAssignedAndCannotBeAssigned
  FROM (
    SELECT
      tPurchases.id AS purchase_id,
      tPurchases.date_ AS purchase_date,
      tLicences.id AS licence_id,
      GREATEST(tPurchases.date_, older.end_, older.start + '1 day'::interval) AS licence_availability_start,
      CASE WHEN newer.id IS NULL THEN 'infinity' ELSE newer.start - '1 day'::interval END AS licence_availability_end,
      COALESCE(newer.start, 'infinity') AS licence_availability_uninstallBy,
      older.id AS assignment_older_id,
      older.start AS assignment_older_start,
      older.end_ AS assignment_older_end,
      newer.id AS assignment_newer_id,
      newer.start AS assignment_newer_start,
      newer.end_ AS assignment_newer_end
    FROM test.tLicences
    INNER JOIN test.tPurchases ON tPurchases.id = tLicences.purchase_id
    LEFT JOIN test.tAssignments AS older ON (NOT older.deleted AND older.licence_id = tLicences.id)
    LEFT JOIN test.tAssignments AS newer ON (NOT newer.deleted AND newer.id <> older.id AND newer.licence_id = older.licence_id)
    WHERE NOT tLicences.deleted

    UNION

    SELECT
      tPurchases.id AS purchase_id,
      tPurchases.date_ AS purchase_date,
      tLicences.id AS licence_id,
      tPurchases.date_ AS licence_availability_start,
      oldest.start - '1 day'::interval AS licence_availability_end,
      oldest.start AS licence_availability_uninstallBy,
      null AS assignment_older_id,
      null AS assignment_older_start,
      null AS assignment_older_end,
      oldest.id AS assignment_newer_id,
      oldest.start AS assignment_newer_start,
      oldest.end_ AS assignment_newer_end
    FROM test.tLicences
    INNER JOIN test.tPurchases ON tPurchases.id = tLicences.purchase_id
    INNER JOIN test.tAssignments AS oldest ON oldest.licence_id = tLicences.id
    WHERE NOT tLicences.deleted AND NOT oldest.deleted
  ) AS periodsOfAvailability_start
  WHERE (assignment_newer_id IS null OR assignment_newer_end IS null)
  GROUP BY purchase_id
) AS purchase_quantities_assignnments ON purchase_quantities_assignnments.purchase_id = tPurchases.id
WHERE
  tLicences.id = 19 /* [Unexpected behaviour] The full set for "purchase_quantities_assignnments" is generated */
  --tLicences.id = 19 AND tLicences.purchase_id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated */
  --tLicences.id = 19 AND tPurchases.id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated */
  --tLicences.purchase_id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated. Note: This is a different query *result* than the others */

Question: Is there some way I can fix this without having to supply tLicences.purchase_id?

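Database version: PostgreSQL 9.0

SQL to generate the schema and tables, and to populate those tables:

This is long-running, because I wanted quantities similar to our real data. If the runtime is a problem, reduce the number of licences (30000) and the number of assignments (100000).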


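1 Answer:

Answer 0: (score: 0)

You may need to update the statistics, but in general you can use CTEs to force the optimization you want. Here I have also pulled all of the subqueries up into CTEs so that everything is explicit: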

WITH myPurchases AS
( 
  SELECT *
  FROM test.tPurchases AS tPurchases
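  -- tLicences is not in scope in this CTE, so find the purchase through a subquery
  WHERE tPurchases.id IN (SELECT purchase_id FROM test.tLicences WHERE id = 19)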
), periodsOfAvailability_start AS
(
  SELECT
      tPurchases.id AS purchase_id,
      tPurchases.date_ AS purchase_date,
      tLicences.id AS licence_id,
      GREATEST(tPurchases.date_, older.end_, older.start + '1 day'::interval) AS licence_availability_start,
      CASE WHEN newer.id IS NULL THEN 'infinity' ELSE newer.start - '1 day'::interval END AS licence_availability_end,
      COALESCE(newer.start, 'infinity') AS licence_availability_uninstallBy,
      older.id AS assignment_older_id,
      older.start AS assignment_older_start,
      older.end_ AS assignment_older_end,
      newer.id AS assignment_newer_id,
      newer.start AS assignment_newer_start,
      newer.end_ AS assignment_newer_end
  FROM test.tLicences
  INNER JOIN myPurchases AS tPurchases ON tPurchases.id = tLicences.purchase_id
  LEFT JOIN test.tAssignments AS older ON (NOT older.deleted AND older.licence_id = tLicences.id)
  LEFT JOIN test.tAssignments AS newer ON (NOT newer.deleted AND newer.id <> older.id AND newer.licence_id = older.licence_id)
  WHERE NOT tLicences.deleted

  UNION

  SELECT
      tPurchases.id AS purchase_id,
      tPurchases.date_ AS purchase_date,
      tLicences.id AS licence_id,
      tPurchases.date_ AS licence_availability_start,
      oldest.start - '1 day'::interval AS licence_availability_end,
      oldest.start AS licence_availability_uninstallBy,
      null AS assignment_older_id,
      null AS assignment_older_start,
      null AS assignment_older_end,
      oldest.id AS assignment_newer_id,
      oldest.start AS assignment_newer_start,
      oldest.end_ AS assignment_newer_end
  FROM test.tLicences
  INNER JOIN myPurchases AS tPurchases ON tPurchases.id = tLicences.purchase_id
  INNER JOIN test.tAssignments AS oldest ON oldest.licence_id = tLicences.id
  WHERE NOT tLicences.deleted AND NOT oldest.deleted
), purchase_quantities_assignnments AS
(
  SELECT
    purchase_id,
    SUM(CASE WHEN assignment_newer_id IS NOT null THEN 1 ELSE 0 END) AS prchs_quantity_assigned,
    SUM(CASE WHEN assignment_newer_id IS null AND current_timestamp BETWEEN licence_availability_start AND licence_availability_end THEN 1 ELSE 0 END) AS prchs_quantity_notAssignedAndCanBeAssigned,
    SUM(CASE WHEN assignment_newer_id IS null AND current_timestamp < licence_availability_start THEN 1 ELSE 0 END) AS prchs_quantity_notAssignedAndCannotBeAssigned
  FROM periodsOfAvailability_start
  WHERE assignment_newer_id IS null OR assignment_newer_end IS null
  GROUP BY purchase_id
)
SELECT *
FROM myPurchases AS tPurchases
INNER JOIN test.tLicences ON tLicences.purchase_id = tPurchases.id
LEFT JOIN purchase_quantities_assignnments ON purchase_quantities_assignnments.purchase_id = tPurchases.id
WHERE tLicences.id = 19 -- return only the requested licence, as in the original query
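
Two notes on the points made in this answer. In PostgreSQL versions before 12 (which includes the 9.0 mentioned in the question), each WITH query is planned and evaluated on its own, and outer conditions are not pushed into it, which is why a CTE can be used to pin down the evaluation order. Updating the statistics is done with ANALYZE; a minimal sketch, using the table names from the question:

```sql
-- Refresh planner statistics for the three tables the query touches.
ANALYZE test.tPurchases;
ANALYZE test.tLicences;
ANALYZE test.tAssignments;
```

Running EXPLAIN ANALYZE on the query again afterwards shows whether the row estimates, and with them the chosen plan, have changed.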