PostgreSQL 10.1 - two similar queries, a 200x difference

Date: 2017-11-30 19:02:44

Tags: postgresql postgresql-performance

Two queries. The first takes about 200 times longer than the second. Why? This is PostgreSQL 10.1. metro and sel are views on the same table.

EXPLAIN ANALYZE
SELECT *
FROM (
       SELECT
         metro.id    AS id,
         metro.title AS name,
         metro.c1
       FROM metro
         LEFT JOIN sel
           ON metro.id = sel.metrosku
       WHERE sel.id IS NULL) t
WHERE t.c1 = 'продукты'
LIMIT 100;

EXPLAIN ANALYZE
WITH t AS (SELECT
             metro.id    AS id,
             metro.title AS name,
             metro.c1
           FROM metro
             LEFT JOIN sel
               ON metro.id = sel.metrosku
           WHERE sel.id IS NULL)
SELECT *
FROM t
WHERE t.c1 = 'продукты'
LIMIT 100;

Query 1:

"QUERY PLAN"
Limit  (cost=0.00..34190.48 rows=1 width=96) (actual time=532.298..86938.359 rows=100 loops=1)
  ->  Nested Loop Left Join  (cost=0.00..34190.48 rows=1 width=96) (actual time=532.298..86938.274 rows=100 loops=1)
        Join Filter: (lower((original.info ->> 'SKU'::text)) = "substring"(((original_1.info -> 'Images'::text) ->> '0'::text), '/(\d+)'::text))
        Rows Removed by Join Filter: 3555434
        Filter: (lower((original_1.info ->> 'SKU'::text)) IS NULL)
        Rows Removed by Filter: 99
        ->  Seq Scan on original  (cost=0.00..17432.97 rows=1 width=1185) (actual time=0.038..2.962 rows=199 loops=1)
              Filter: (((competitor)::text = 'metrocc'::text) AND ((info ->> 'Type'::text) = 'Item'::text) AND (lower(((info -> 'Catalog'::text) ->> '0'::text)) = 'продукты'::text))
              Rows Removed by Filter: 63
        ->  Seq Scan on original original_1  (cost=0.00..16754.80 rows=90 width=1185) (actual time=0.484..169.594 rows=17867 loops=199)
              Filter: (((competitor)::text = 'sel'::text) AND ((info ->> 'Type'::text) = 'Item'::text))
              Rows Removed by Filter: 49950
Planning time: 0.471 ms
Execution time: 86938.450 ms

Query 2:

"QUERY PLAN"
Limit  (cost=33521.79..33521.82 rows=1 width=96) (actual time=425.243..443.735 rows=100 loops=1)
  CTE t
    ->  Hash Left Join  (cost=16755.92..33521.79 rows=1 width=96) (actual time=425.239..443.574 rows=140 loops=1)
          Hash Cond: (lower((original.info ->> 'SKU'::text)) = "substring"(((original_1.info -> 'Images'::text) ->> '0'::text), '/(\d+)'::text))
          Filter: (lower((original_1.info ->> 'SKU'::text)) IS NULL)
          Rows Removed by Filter: 82
          ->  Seq Scan on original  (cost=0.00..16754.80 rows=144 width=1185) (actual time=0.022..7.077 rows=1638 loops=1)
                Filter: (((competitor)::text = 'metrocc'::text) AND ((info ->> 'Type'::text) = 'Item'::text))
                Rows Removed by Filter: 54
          ->  Hash  (cost=16754.80..16754.80 rows=90 width=1185) (actual time=424.723..424.723 rows=16215 loops=1)
                Buckets: 4096 (originally 1024)  Batches: 8 (originally 1)  Memory Usage: 4066kB
                ->  Seq Scan on original original_1  (cost=0.00..16754.80 rows=90 width=1185) (actual time=0.612..175.330 rows=17867 loops=1)
                      Filter: (((competitor)::text = 'sel'::text) AND ((info ->> 'Type'::text) = 'Item'::text))
                      Rows Removed by Filter: 49950
  ->  CTE Scan on t  (cost=0.00..0.02 rows=1 width=96) (actual time=425.242..443.716 rows=100 loops=1)
        Filter: (c1 = 'продукты'::text)
        Rows Removed by Filter: 40
Planning time: 0.451 ms
Execution time: 449.512 ms

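3 answers:

Answer 0 (score: 3)

In PostgreSQL 10, a CTE whose output is used is materialized first and only then referenced. Predicates from the outer query are not pushed down into it.

This is known behavior and is documented here: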

https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/

As others have pointed out, this is evident from the EXPLAIN output itself.
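
To make the fence concrete, here is a sketch that keeps the subquery form from the question but adds `OFFSET 0`. This is a commonly used trick, not something from the question itself: a subquery with an OFFSET clause is not flattened into the outer query, so the outer `WHERE` is applied after the join, much as in the CTE version. Whether it reproduces the faster hash-join plan on this data is an assumption that has to be checked with EXPLAIN ANALYZE.

```sql
-- Sketch only: Query 1 from the question with OFFSET 0 added.
-- OFFSET 0 changes no results, but it stops the planner from pushing
-- the outer predicate (t.c1 = ...) down into the subquery.
EXPLAIN ANALYZE
SELECT *
FROM (
       SELECT
         metro.id    AS id,
         metro.title AS name,
         metro.c1
       FROM metro
         LEFT JOIN sel
           ON metro.id = sel.metrosku
       WHERE sel.id IS NULL
       OFFSET 0) t
WHERE t.c1 = 'продукты'
LIMIT 100;
```

From PostgreSQL 12 on, CTEs can be inlined by the planner and the `MATERIALIZED` keyword controls this explicitly; that does not apply to the 10.1 server in the question.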

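Answer 1 (score: 0)

It is right there in the EXPLAIN plan itself.

The first query's cost is 0.00..34190.48, so the estimated cost of returning the first row is close to 0. Since you only need the first 100 rows, the planner expected it to run much faster than the second.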

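Answer 2 (score: 0)

EXPLAIN results:
Query 1 (subquery): planning time 0.471 ms, execution time 86938.450 ms
Query 2 (CTE): planning time 0.451 ms, execution time 449.512 ms

According to your results, the CTE form is faster than the subquery form; in this case the planner made a poor optimization decision.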
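
To see where the poor decision shows up, the figures in the Query 1 plan can be multiplied out. The outer scan on original was estimated at rows=1 but returned 199 rows, so the inner sequential scan ran 199 times (loops=199). The statement below only redoes that arithmetic with numbers copied from the two plans; the column aliases are made up for illustration.

```sql
-- Arithmetic on figures copied from the plans above; no tables are touched.
SELECT
  199 * 17867                   AS inner_rows_scanned,  -- loops x average rows per loop: 3555533
  3555434 + 99                  AS join_pairs_checked,  -- removed by join filter + pairs that matched: 3555533
  round(199 * 169.594)          AS inner_scan_ms,       -- loops x average time per loop: 33749
  round(86938.450 / 449.512, 1) AS slowdown_factor;     -- Query 1 time / Query 2 time: 193.4
```

The first two columns agree, which is consistent with every outer row being compared against every row of the inner scan. The inner scans alone account for roughly 34 of the 87 seconds; the rest is presumably spent evaluating the join filter on about 3.5 million row pairs.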

You could try EXPLAIN (BUFFERS, ANALYZE); the extra detail may help.
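
A minimal sketch of that suggestion, applied to Query 1 from the question. The second half is an addition of mine rather than part of the suggestion above: it turns off `enable_nestloop` for one transaction, purely as a diagnostic, to check whether the nested loop is what makes the subquery form slow.

```sql
-- Query 1 with buffer usage (shared hits/reads per node) added to the plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM (
       SELECT
         metro.id    AS id,
         metro.title AS name,
         metro.c1
       FROM metro
         LEFT JOIN sel
           ON metro.id = sel.metrosku
       WHERE sel.id IS NULL) t
WHERE t.c1 = 'продукты'
LIMIT 100;

-- Diagnostic only (assumption: the nested loop is the culprit).
-- SET LOCAL lasts until the end of the transaction, so nothing leaks out.
BEGIN;
SET LOCAL enable_nestloop = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM (
       SELECT
         metro.id    AS id,
         metro.title AS name,
         metro.c1
       FROM metro
         LEFT JOIN sel
           ON metro.id = sel.metrosku
       WHERE sel.id IS NULL) t
WHERE t.c1 = 'продукты'
LIMIT 100;
ROLLBACK;
```

If the second plan switches to a hash join and the run time drops to the level of the CTE form, that points at the rows=1 estimate on the outer scan as the root cause rather than at I/O.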