Question

I have this very long query, which I will summarize here and paste in full at the bottom:

select * from
a
left join b t1 on a.x = t1.x
left join b t2 on a.y = t2.x
left join b t3 on a.z = t3.x
left join c on a.1 = c.1 and a.2 = c.2 and a.3 = c.3 --call this predicate 1
where c.z is null

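a and c have a primary key on (1, 2, 3), unclustered. a.x, a.y, or a.z can be null. As you will see in the plans linked below, a has 40k rows, c has 500k rows, and b has 7k rows. This query takes 10 minutes; doing it by hand in Excel is faster. My row count estimates are all wrong, even after I run VACUUM FULL ANALYZE, and the plan uses nested loops where it should not.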

Here is the full plan: https://explain.depesz.com/s/w2uN

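When I remove predicate 1, all the nested loops go away, although the row estimates are still wrong. The query then takes less than a second, even right after I restart Postgres, so caching is not what makes it fast: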

https://explain.depesz.com/s/O7R

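Any ideas on how to force a hash join? Alternatively, I could build 3 indexes on the 40k-row table, which I truncate and reload all the time, since it is really a staging table that gets refreshed weekly. That looks like overkill, but it probably would not hurt anyone. There also seems to be very little information on how to get correct row counts in the planner, apart from VACUUM ANALYZE. Any ideas on that?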

Finally, here is the full code:

SELECT
    ar.cocd,
    ar.customer,
    ar.sales_doc,
    ar.documentno,
    ar.headertext,
    ar.clrng_doc,
    ar.typ,
    ar.net_due_dt,
    ar.amt,
    ar.lcurr,
    ar.amount_in_dc,
    ar.curr,
    ar.text,
    ar.doc_date,
    ar.clearing,
    ar.po_number,
    ar.payt,
    ar.st,
    ar.arrear,
    ar.gl,
    ar.user_name,
    ar.tcod,
    ar.itm,
    ar.inv_ref,
    ar.amount_in_loccurr2,
    ar.pmnt_date,
    ar.pk,
    ar.pstng_date,
    ar.account,
    ar.accty,
    ar.aging_bucket,
    ar.billdoc,
    ar.ftyp,
    ar.general_ledger_amt,
    ar.offstacct,
    ar.pmtmthsu,
    ar.purchdoc,
    ar.rcd,
    ar.transtype,
    ar.ym,
    COALESCE(ar_f2.branch, ar_f2.subbranch, ar_f2.account) AS forecast_company,
    ar_f.customer_name AS paying_company,
    ar_f3.customer_name AS shipping_company
   FROM h_ar_open ar
     LEFT JOIN h_ar_forecast ar_f ON ar.customer = ar_f.customer
     LEFT JOIN h_ar_forecast ar_f2 ON ar.soldto::double precision = ar_f2.customer
     LEFT JOIN h_ar_forecast ar_f3 ON ar.shipto::double precision = ar_f3.customer
     LEFT JOIN h_ar_hist hist ON ar.cocd = hist.cocd AND ar.itm = hist.itm AND ar.documentno = hist.documentno
  WHERE hist.documentno IS NULL;

Answer 1

The index suggested in the comments helps a lot, because it makes the nested loop joins faster. 46417 sequential scans over 7000 rows are bad.
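As a rough sketch of what such an index could look like: all three forecast joins in the full query compare against h_ar_forecast.customer, so a single index on that column should be able to serve each of them. The index name below is made up for illustration, and the exact index proposed in the comments is not reproduced here.

```sql
-- Hypothetical index name; the column comes from the join conditions in the question.
-- With this in place, each nested loop iteration can probe the index
-- instead of scanning all ~7k rows of h_ar_forecast.
CREATE INDEX h_ar_forecast_customer_idx ON h_ar_forecast (customer);

-- Refresh planner statistics after the weekly reload.
ANALYZE h_ar_forecast;
```

The casts in the join conditions (ar.soldto::double precision) are on the h_ar_open side, so they should not prevent the index on customer from being used.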

Your problem is the misestimate.
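One way to look at the misestimate in isolation is to run only the anti-join part under EXPLAIN (ANALYZE) and compare the estimated and actual row counts on the join node. The sketch below also disables nested loops for a single transaction, which is meant as a diagnostic to see what plan the planner would pick otherwise, not as a permanent setting.

```sql
BEGIN;

-- Applies only until the end of this transaction.
SET LOCAL enable_nestloop = off;

-- Compare "rows=" in the estimate with the actual rows on the join node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ar.documentno
FROM h_ar_open ar
   LEFT JOIN h_ar_hist hist
      ON ar.cocd = hist.cocd
     AND ar.itm = hist.itm
     AND ar.documentno = hist.documentno
WHERE hist.documentno IS NULL;

ROLLBACK;
```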

Perhaps you can force the join order with something like this:

SELECT ...
FROM (SELECT ...
      FROM a
         LEFT JOIN b t1 ...
         LEFT JOIN b t2 ...
         LEFT JOIN b t3 ...
      OFFSET 0) x
   LEFT JOIN c ...;

If c is joined last, the misestimate does no harm.
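Applied to the tables from the question, the pattern above might look roughly like the following. This is an untested sketch: ar.* is used as shorthand and returns every column of h_ar_open, so in practice you would list the columns from the original query instead.

```sql
SELECT x.*
FROM (SELECT ar.*,  -- shorthand; replace with the explicit column list
             COALESCE(ar_f2.branch, ar_f2.subbranch, ar_f2.account) AS forecast_company,
             ar_f.customer_name AS paying_company,
             ar_f3.customer_name AS shipping_company
      FROM h_ar_open ar
         LEFT JOIN h_ar_forecast ar_f ON ar.customer = ar_f.customer
         LEFT JOIN h_ar_forecast ar_f2 ON ar.soldto::double precision = ar_f2.customer
         LEFT JOIN h_ar_forecast ar_f3 ON ar.shipto::double precision = ar_f3.customer
      OFFSET 0  -- keeps the subquery from being flattened into the outer query
     ) x
   LEFT JOIN h_ar_hist hist
      ON x.cocd = hist.cocd
     AND x.itm = hist.itm
     AND x.documentno = hist.documentno
WHERE hist.documentno IS NULL;
```

Because the subquery is planned on its own, the three forecast joins are done before the join against h_ar_hist, whatever the planner estimates for that last join.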

Postgres hash join vs nested loop decision