PostgreSQL planner chooses hash/merge join over a faster nested loop

Time: 2018-06-19 16:23:46

Tags: postgresql nested-loops postgresql-performance

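I have two tables, tbl1 and tbl2, with 700M and 900M rows respectively.
When I look up a single value in tbl1 and join it to tbl2, the optimizer chooses a nested loop join with an index scan, and the query runs fast. But when I try the same thing with 500 values, it chooses hash and merge joins and runs forever. However, when I set enable_hashjoin and enable_mergejoin to off, it chooses a nested loop join and works like a charm.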
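For reference, the toggling described above might look like the sketch below. The actual query is not shown here, so the join columns (`id`, `tbl1_id`) and the value list are placeholders, not the real schema. `SET LOCAL` limits the change to the current transaction, so other sessions are unaffected.

```sql
-- Sketch only: column names and the IN list are hypothetical placeholders.
BEGIN;

-- Discourage hash and merge joins for this transaction only.
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;

-- EXPLAIN ANALYZE really executes the query and reports actual row counts
-- and timings next to the planner's estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t2.*
FROM tbl1 AS t1
JOIN tbl2 AS t2 ON t2.tbl1_id = t1.id   -- hypothetical join columns
WHERE t1.id IN (1, 2, 3);               -- stands in for the ~500 values

-- Ending the transaction reverts both settings.
ROLLBACK;
```

Running the same `EXPLAIN (ANALYZE, BUFFERS)` with the two settings left at their defaults gives a second plan to compare against, including estimated versus actual row counts for each node.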

Why doesn't PostgreSQL choose the nested loop join, given that it runs faster in this case?

**Query plan with hash and merge turned off:**

Sort  (cost=598690203560.58..598690203860.89 rows=120124 width=336)
  Sort Key: cte_fl3_837_835_all.submitintchgtime_isa_09_10 DESC, cte_fl3_837_835_all.submitclaimid_2300_clm_01 DESC
  CTE cte_my_claims
    ->  Nested Loop  (cost=0.00..139.48 rows=4734 width=32)
          ->  Seq Scan on test_fl3_837p_claims  (cost=0.00..31.89 rows=789 width=13)
          ->  Materialize  (cost=0.00..1.09 rows=6 width=3)
                ->  Seq Scan on all_market_claim_id_manipulation  (cost=0.00..1.06 rows=6 width=3)
  CTE cte_fl3_837p_clm
    ->  Nested Loop  (cost=0.56..59755293.38 rows=17689084 width=48)
          ->  CTE Scan on cte_my_claims clml  (cost=0.00..94.68 rows=4734 width=32)
          ->  Index Scan using pidx_fl3_837p_segments_clm_id on fl3_837p_segments p837  (cost=0.56..12585.19 rows=3737 width=48)
                Index Cond: (element_01 = clml.part_claim_id)


**With Hash and Merge On:**

Sort  (cost=7081980295.88..7081980596.19 rows=120124 width=336)
  Sort Key: cte_fl3_837_835_all.submitintchgtime_isa_09_10 DESC, cte_fl3_837_835_all.submitclaimid_2300_clm_01 DESC
  CTE cte_my_claims
    ->  Nested Loop  (cost=0.00..139.48 rows=4734 width=32)
          ->  Seq Scan on test_fl3_837p_claims  (cost=0.00..31.89 rows=789 width=13)
          ->  Materialize  (cost=0.00..1.09 rows=6 width=3)
                ->  Seq Scan on all_market_claim_id_manipulation  (cost=0.00..1.06 rows=6 width=3)
  CTE cte_fl3_837p_clm
    ->  Merge Join  (cost=26624634.82..26941108.79 rows=17689084 width=48)
          Merge Cond: (clml.part_claim_id = p837.element_01)
          ->  Sort  (cost=383.66..395.50 rows=4734 width=32)
                Sort Key: clml.part_claim_id
                ->  CTE Scan on cte_my_claims clml  (cost=0.00..94.68 rows=4734 width=32)
          ->  Materialize  (cost=26624251.16..26717290.79 rows=18607927 width=48)
                ->  Sort  (cost=26624251.16..26670770.98 rows=18607927 width=48)
                      Sort Key: p837.element_01
                      ->  Bitmap Heap Scan on fl3_837p_segments p837  (cost=482386.53..22660154.99 rows=18607927 width=48)
                            Recheck Cond: ((segment_type)::text = 'CLM'::text)
                            ->  Bitmap Index Scan on pidx_fl3_837p_segments_clm_series_id  (cost=0.00..477734.55 rows=18607927 width=0)

0 answers:

No answers yet.