Does PostgreSQL share the ordering of a CTE?

Time: 2018-01-18 17:26:07

Tags: sql postgresql common-table-expression

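In PostgreSQL, common table expressions (CTEs) are optimization fences. This means that the result of a CTE is materialized, and predicates from the outer query are never pushed down into the CTE.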
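
To make the fence concrete, here is a minimal sketch using the `object` table with a `type` column from this question. The CTE name `all_objects` and the literal 42 are made up for illustration. It assumes a PostgreSQL version in which CTEs are always materialized (releases before 12); from PostgreSQL 12 on, a side-effect-free CTE that is referenced only once may be inlined unless it is written `AS MATERIALIZED`.

```sql
-- CTE form: with the fence in place, the whole table is materialized first
-- and the filter is applied afterwards, on the CTE scan.
EXPLAIN
WITH all_objects AS (
    SELECT * FROM object
)
SELECT * FROM all_objects WHERE type = 42;

-- Subquery form: the planner can flatten the subquery and apply the filter
-- directly to the scan of "object".
EXPLAIN
SELECT *
FROM (SELECT * FROM object) AS all_objects
WHERE type = 42;
```

Comparing the two plans should show the filter sitting on a CTE Scan node in the first case and on the scan of `object` in the second.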

Now I am wondering whether other metadata about the CTE, such as its ordering, is shared with the outer query. Take the following query:

WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC LIMIT 10
)
SELECT MIN(type) FROM ordered_objects

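Here MIN(type) is obviously always the type of the first row of ordered_objects (or NULL if ordered_objects is empty), because ordered_objects is already ordered by type. Is this knowledge about ordered_objects available when SELECT MIN(type) FROM ordered_objects is evaluated?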
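
For comparison, the shortcut the question has in mind can be written out by hand. This is only a sketch of an equivalent query, not something the planner derives from the CTE; `min_type` is an illustrative alias.

```sql
-- Take the first row of the ordered set directly.
-- The scalar subquery yields NULL when "object" has no rows, matching MIN().
-- With the default NULLS LAST for ASC, NULL types sort after real values,
-- so the first row holds the smallest non-NULL type if one exists.
SELECT (
    SELECT type
    FROM object
    ORDER BY type ASC
    LIMIT 1
) AS min_type;
```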

2 Answers:

Answer 0: (score: 4)

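If I understand your question correctly: no, it does not. There is no such knowledge, as the example below shows. With the limit of 10 rows, execution is much faster because there is far less data to process (about a million times less in my case). If the ordering were known to the outer query, the version without the limit would not need to read more than the first row; instead the CTE scan reads the whole ordered set, ignoring the fact that the minimum is in the first row.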

Data:

t=# create table object (type bigint);
CREATE TABLE
Time: 4.636 ms
t=# insert into object select generate_series(1,9999999);
INSERT 0 9999999
Time: 7769.275 ms

With limit:

explain analyze WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC LIMIT 10
)
SELECT MIN(type) FROM ordered_objects;

Execution time: 3150.183 ms

https://explain.depesz.com/s/5yXe

Without limit:

explain analyze WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC
)
SELECT MIN(type) FROM ordered_objects;

Execution time: 16032.989 ms

https://explain.depesz.com/s/1SU

I did warm up the data before testing.
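
As a point of comparison for the timings above, here is a hedged sketch of what one might try next on the same test table. The index name `object_type_idx` is hypothetical, and no timings are claimed here. The idea is that an aggregate written directly against the base table can use a btree index on `type`, whereas the aggregate over the CTE only sees the output of a CTE scan.

```sql
-- Hypothetical index on the column being aggregated.
CREATE INDEX object_type_idx ON object (type);
ANALYZE object;

-- MIN() directly on the base table: the planner can answer this from the index.
EXPLAIN ANALYZE
SELECT MIN(type) FROM object;

-- The same aggregate through the CTE from the test above, for comparison.
EXPLAIN ANALYZE
WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC
)
SELECT MIN(type) FROM ordered_objects;
```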

Answer 1: (score: 2)

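  • [In Postgres] a CTE is always executed once, even if it is referenced more than once.
  • Its result is stored in a temporary table (materialized).
  • The outer query has no knowledge of the internal structure (indexes are not available) or of the ordering (I am not sure about frequency estimates); it just scans the temporary result.
  • In the fragment below the CTE is scanned twice, even though the results are known to be identical.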
\d react

EXPLAIN ANALYZE
WITH omg AS (
        SELECT topic_id
        , row_number() OVER (PARTITION by krant_id ORDER BY topic_id) AS rn
        FROM react
        WHERE krant_id = 1
        AND topic_id < 5000000
        ORDER BY topic_id ASC
        )
SELECT MIN (o2.topic_id)
FROM omg o1                   --
JOIN omg o2 ON o1.rn = o2.rn  -- exactly the same
WHERE o1.rn = 1
        ;
                    Table "public.react"
   Column   |           Type           |     Modifiers      
------------+--------------------------+--------------------
 krant_id   | integer                  | not null default 1
 topic_id   | integer                  | not null
 react_id   | integer                  | not null
 react_date | timestamp with time zone | 
 react_nick | character varying(1000)  | 
 react_body | character varying(4000)  | 
 zoek       | tsvector                 | 
Indexes:
    "react_pkey" PRIMARY KEY, btree (krant_id, topic_id, react_id)
    "react_krant_id_react_nick_react_date_topic_id_react_id_idx" UNIQUE, btree (krant_id, react_nick, react_date, topic_id, react_id)
    "react_date" btree (krant_id, topic_id, react_date)
    "react_nick" btree (krant_id, topic_id, react_nick)
    "react_zoek" gin (zoek)
Triggers:
    tr_upd_zzoek_i BEFORE INSERT ON react FOR EACH ROW EXECUTE PROCEDURE tf_upd_zzoek()
    tr_upd_zzoek_u BEFORE UPDATE ON react FOR EACH ROW WHEN (new.react_body::text <> old.react_body::text) EXECUTE PROCEDURE tf_upd_zzoek()

----------

 Aggregate  (cost=232824.29..232824.29 rows=1 width=4) (actual time=1773.643..1773.645 rows=1 loops=1)
   CTE omg
     ->  WindowAgg  (cost=0.43..123557.17 rows=402521 width=8) (actual time=0.217..1246.577 rows=230822 loops=1)
           ->  Index Only Scan using react_pkey on react  (cost=0.43..117519.35 rows=402521 width=8) (actual time=0.161..419.916 rows=230822 loops=1)
                 Index Cond: ((krant_id = 1) AND (topic_id < 5000000))
                 Heap Fetches: 442
   ->  Nested Loop  (cost=0.00..99136.69 rows=4052169 width=4) (actual time=0.264..1773.624 rows=1 loops=1)
         ->  CTE Scan on omg o1  (cost=0.00..9056.72 rows=2013 width=8) (actual time=0.249..59.252 rows=1 loops=1)
               Filter: (rn = 1)
               Rows Removed by Filter: 230821
         ->  CTE Scan on omg o2  (cost=0.00..9056.72 rows=2013 width=12) (actual time=0.003..1714.355 rows=1 loops=1)
               Filter: (rn = 1)
               Rows Removed by Filter: 230821
 Total runtime: 1782.887 ms
(14 rows)
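
Since both CTE scans in the plan above filter on `rn = 1`, the self-join can be dropped by hand. A minimal sketch, assuming the same `react` table: it should return the same value, because `rn` is unique within the single `krant_id = 1` partition, so the join only ever matches that one row with itself. The ORDER BY inside the CTE is omitted here because the outer query cannot make use of it.

```sql
EXPLAIN ANALYZE
WITH omg AS (
    SELECT topic_id
         , row_number() OVER (PARTITION BY krant_id ORDER BY topic_id) AS rn
    FROM react
    WHERE krant_id = 1
      AND topic_id < 5000000
)
SELECT MIN(topic_id)
FROM omg
WHERE rn = 1;   -- one CTE scan instead of two
```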