Tricky PostgreSQL query optimization (distinct row aggregation with ordering)

Date: 2017-04-06 07:54:43

Tags: sql performance postgresql optimization

I have an events table; a table with a very similar schema and data distribution can easily be generated locally:

CREATE TABLE events AS
WITH args AS (
    SELECT
        300 AS scale_factor, -- feel free to reduce this to speed up local testing
        1000 AS pa_count,
        1 AS l_count_min,
        29 AS l_count_rand,
        10 AS c_count,
        10 AS pr_count,
        3 AS r_count,
        '10 days'::interval AS time_range -- edit 2017-05-02: the real data set has years worth of data here, but the query time ranges stay small (a couple days)
)

SELECT
    p.c_id,
    'ABC'||lpad(p.pa_id::text, 13, '0') AS pa_id,
    'abcdefgh-'||((random()*(SELECT pr_count-1 FROM args)+1))::int AS pr_id,
    ((random()*(SELECT r_count-1 FROM args)+1))::int AS r,
    '2017-01-01Z00:00:00'::timestamp without time zone + random()*(SELECT time_range FROM args) AS t
FROM (
    SELECT
        pa_id,
        ((random()*(SELECT c_count-1 FROM args)+1))::int AS c_id,
        (random()*(SELECT l_count_rand FROM args)+(SELECT l_count_min FROM args))::int AS l_count
    FROM generate_series(1, (SELECT pa_count*scale_factor FROM args)) pa_id
) p
JOIN LATERAL (
    SELECT generate_series(1, p.l_count)
) l(id) ON (true);

An excerpt from SELECT * FROM events:

(screenshot of sample rows omitted)

What I need is a query that selects all rows for a given c_id in a given time range of t, then filters them down to include only the latest rows (by t) for each unique pr_id, pa_id combination, and then counts the pr_id, r combinations of those rows.

This is quite a mouthful, so here are 3 SQL queries I came up with that produce the desired results:

WITH query_a AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id,
          pa_id,
          r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        ORDER BY pr_id, pa_id, t DESC
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
),


query_b AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT
          pr_id,
          pa_id,
          first_not_null(r ORDER BY t DESC) AS r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        GROUP BY
          1,
          2
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
),

query_c AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT
          pr_id,
          pa_id,
          first_not_null(r) AS r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        GROUP BY
          1,
          2
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
)

Below are the custom aggregate function used by query_b and query_c, as well as what I believe to be the optimal index, settings, and conditions:

-- Transition function: always keeps the current state ($1). Because it is
-- STRICT, NULL inputs are skipped and the first non-null input seeds the
-- state, so the aggregate below returns the first non-null value it sees.
CREATE FUNCTION first_not_null_agg(before anyelement, value anyelement) RETURNS anyelement
    LANGUAGE sql IMMUTABLE STRICT
    AS $_$
  SELECT $1;
$_$;


CREATE AGGREGATE first_not_null(anyelement) (
    SFUNC = first_not_null_agg,
    STYPE = anyelement
);


CREATE INDEX events_idx ON events USING btree (c_id, t DESC, pr_id, pa_id, r);
VACUUM ANALYZE events;
SET work_mem='128MB';
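To make the semantics of first_not_null concrete, here is a small self-contained check (my illustration, not part of the original setup): with an ORDER BY inside the aggregate call, it returns the value from the row that sorts first.

SELECT first_not_null(x ORDER BY o) AS picked
FROM (VALUES (1, 'b'), (2, 'a')) AS v(x, o);
-- picked = 2, because the row (2, 'a') sorts first under ORDER BY o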

My dilemma is that query_c outperforms query_a and query_b by more than 6x, but is technically not guaranteed to produce the same results as the other queries (note the missing ORDER BY in the first_not_null aggregate). In practice, however, it seems to pick a query plan that I believe to be correct and most optimal.

Below is the EXPLAIN (ANALYZE, VERBOSE) output for all 3 queries on my local machine:

query_a

CTE Scan on query_a  (cost=25810.77..26071.25 rows=13024 width=44) (actual time=3329.921..3329.934 rows=30 loops=1)
  Output: query_a.pr_id, query_a.r, query_a.quantity
  CTE query_a
    ->  Sort  (cost=25778.21..25810.77 rows=13024 width=23) (actual time=3329.918..3329.921 rows=30 loops=1)
          Output: events.pr_id, events.r, (count(1))
          Sort Key: (count(1)), events.r, events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=24757.86..24888.10 rows=13024 width=23) (actual time=3329.849..3329.892 rows=30 loops=1)
                Output: events.pr_id, events.r, count(1)
                Group Key: events.pr_id, events.r
                ->  Unique  (cost=21350.90..22478.71 rows=130237 width=40) (actual time=3168.656..3257.299 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, events.r, events.t
                      ->  Sort  (cost=21350.90..21726.83 rows=150375 width=40) (actual time=3168.655..3209.095 rows=153795 loops=1)
                            Output: events.pr_id, events.pa_id, events.r, events.t
                            Sort Key: events.pr_id, events.pa_id, events.t DESC
                            Sort Method: quicksort  Memory: 18160kB
                            ->  Index Only Scan using events_idx on public.events  (cost=0.56..8420.00 rows=150375 width=40) (actual time=0.038..101.584 rows=153795 loops=1)
                                  Output: events.pr_id, events.pa_id, events.r, events.t
                                  Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                                  Heap Fetches: 0
Planning time: 0.316 ms
Execution time: 3331.082 ms

query_b

CTE Scan on query_b  (cost=67140.75..67409.53 rows=13439 width=44) (actual time=3761.077..3761.090 rows=30 loops=1)
  Output: query_b.pr_id, query_b.r, query_b.quantity
  CTE query_b
    ->  Sort  (cost=67107.15..67140.75 rows=13439 width=23) (actual time=3761.074..3761.081 rows=30 loops=1)
          Output: events.pr_id, (first_not_null(events.r ORDER BY events.t DESC)), (count(1))
          Sort Key: (count(1)), (first_not_null(events.r ORDER BY events.t DESC)), events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=66051.24..66185.63 rows=13439 width=23) (actual time=3760.997..3761.049 rows=30 loops=1)
                Output: events.pr_id, (first_not_null(events.r ORDER BY events.t DESC)), count(1)
                Group Key: events.pr_id, first_not_null(events.r ORDER BY events.t DESC)
                ->  GroupAggregate  (cost=22188.98..63699.49 rows=134386 width=32) (actual time=2961.471..3671.669 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, first_not_null(events.r ORDER BY events.t DESC)
                      Group Key: events.pr_id, events.pa_id
                      ->  Sort  (cost=22188.98..22578.94 rows=155987 width=40) (actual time=2961.436..3012.440 rows=153795 loops=1)
                            Output: events.pr_id, events.pa_id, events.r, events.t
                            Sort Key: events.pr_id, events.pa_id
                            Sort Method: quicksort  Memory: 18160kB
                            ->  Index Only Scan using events_idx on public.events  (cost=0.56..8734.27 rows=155987 width=40) (actual time=0.038..97.336 rows=153795 loops=1)
                                  Output: events.pr_id, events.pa_id, events.r, events.t
                                  Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                                  Heap Fetches: 0
Planning time: 0.385 ms
Execution time: 3761.852 ms

query_c

CTE Scan on query_c  (cost=51400.06..51660.54 rows=13024 width=44) (actual time=524.382..524.395 rows=30 loops=1)
  Output: query_c.pr_id, query_c.r, query_c.quantity
  CTE query_c
    ->  Sort  (cost=51367.50..51400.06 rows=13024 width=23) (actual time=524.380..524.384 rows=30 loops=1)
          Output: events.pr_id, (first_not_null(events.r)), (count(1))
          Sort Key: (count(1)), (first_not_null(events.r)), events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=50347.14..50477.38 rows=13024 width=23) (actual time=524.311..524.349 rows=30 loops=1)
                Output: events.pr_id, (first_not_null(events.r)), count(1)
                Group Key: events.pr_id, first_not_null(events.r)
                ->  HashAggregate  (cost=46765.62..48067.99 rows=130237 width=32) (actual time=401.480..459.962 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, first_not_null(events.r)
                      Group Key: events.pr_id, events.pa_id
                      ->  Index Only Scan using events_idx on public.events  (cost=0.56..8420.00 rows=150375 width=32) (actual time=0.027..109.459 rows=153795 loops=1)
                            Output: events.c_id, events.t, events.pr_id, events.pa_id, events.r
                            Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                            Heap Fetches: 0
Planning time: 0.296 ms
Execution time: 525.566 ms

Broadly speaking, I believe the index above should allow query_a and query_b to execute without the Sort nodes that slow them down, but so far I have failed to convince the postgres query optimizer to do my bidding.

I am also somewhat confused about the t column not being included in the Sort Key of query_b, considering that quicksort is not stable. It seems like this could yield the wrong results.

I have verified that all 3 queries generate the same results by running the following queries and checking that they produce an empty result set:

SELECT * FROM query_a
EXCEPT
SELECT * FROM query_b;

SELECT * FROM query_a
EXCEPT
SELECT * FROM query_c;

When in doubt, I would consider query_a the canonical query.

I would greatly appreciate any input on this. I have actually found a terribly hacky workaround to achieve acceptable performance in my application, but this problem keeps haunting me in my sleep (and in fact on vacation, which is where I am right now)....

FWIW, I have looked at many similar questions and answers that have informed my current thinking, but I believe there is something unique about grouping by two columns (pr_id, pa_id) while having to sort by a 3rd column (t), which keeps this from being a duplicate question.

Edit: the outer queries in the examples may be entirely irrelevant to the problem, so feel free to disregard them if that helps.

5 Answers:

Answer 0 (score: 1)

> When in doubt, I would consider query_a the canonical query.

I found a way to get query_a down to half a second.

The inner query from query_a

SELECT DISTINCT ON (pr_id, pa_id)

needs to be paired with

ORDER BY pr_id, pa_id, t DESC

which lists pr_id and pa_id first (c_id = 5 is a constant), but the index events_idx (c_id, t DESC, pr_id, pa_id, r) cannot be used here, because the columns (pr_id, pa_id, t DESC) do not appear in the order the ORDER BY clause requires. If you had an index covering at least (pr_id, pa_id, t DESC), no sort would be needed, because the ORDER BY condition matches that index.

So here is what I did:

CREATE INDEX events_idx2 ON events (c_id, pr_id, pa_id, t DESC, r);

Your inner query can use this index, at least in theory. Unfortunately, the query planner thinks it is better to reduce the number of rows by using events_idx with c_id and x <= t < y. Postgres has no index hints, so we need a different way to convince the query planner to take the new index events_idx2.

One way to force the use of events_idx2 is to make the other index more expensive. This can be done by removing the last column r from events_idx, making it unusable for query_a (at least unusable without loading pages from the heap).
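As an aside (my addition, not from the original answer), Postgres DDL is transactional, so you can preview the plan you would get without events_idx by dropping it inside a transaction and rolling back. Be aware that DROP INDEX holds an exclusive lock on the table until the ROLLBACK, so only do this on a test box:

BEGIN;
DROP INDEX events_idx;
EXPLAIN (ANALYZE, VERBOSE)
SELECT DISTINCT ON (pr_id, pa_id) pr_id, pa_id, r
FROM events
WHERE c_id = 5 AND t >= '2017-01-03Z00:00:00' AND t < '2017-01-06Z00:00:00'
ORDER BY pr_id, pa_id, t DESC;
ROLLBACK;  -- events_idx is back as if nothing happened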

Moving the t column further back in the index layout is counter-intuitive, because the first columns are usually chosen for = comparisons and ranges, which c_id and t qualify for. However, your ORDER BY (pr_id, pa_id, t DESC) demands at least this subset as an index. Of course, we still put c_id first to cut down the row count as early as possible.

You can still keep an index on (c_id, t DESC, pr_id, pa_id) if you need it elsewhere, but it will not be used by query_a.

Below is the query plan for query_a using events_idx2, with events_idx deleted. Look for events_c_id_pr_id_pa_id_t_r_idx, which is how PG names indexes automatically when you do not name them yourself. I like it this way, because I can see the order of the indexed columns right in the index name in every query plan.

 Sort  (cost=30076.71..30110.75 rows=13618 width=23) (actual time=426.898..426.914 rows=30 loops=1)
   Sort Key: (count(1)), events.r, events.pr_id DESC
   Sort Method: quicksort  Memory: 27kB
   ->  HashAggregate  (cost=29005.43..29141.61 rows=13618 width=23) (actual time=426.820..426.859 rows=30 loops=1)
         Group Key: events.pr_id, events.r
         ->  Unique  (cost=0.56..26622.33 rows=136177 width=40) (actual time=0.037..328.828 rows=117204 loops=1)
               ->  Index Only Scan using events_c_id_pr_id_pa_id_t_r_idx on events  (cost=0.56..25830.50 rows=158366 width=40) (actual time=0.035..178.594 rows=154940 loops=1)
                     Index Cond: ((c_id = 5) AND (t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                     Heap Fetches: 0
 Planning time: 0.201 ms
 Execution time: 427.017 ms
(11 rows)

Planning is instantaneous and performance is sub-second, since the index matches the ORDER BY of the inner query.

With query_a performing this well, there is no need for the alternative queries query_b and query_c, which rely on an extra function, to be faster.

Notes:

Somehow I could not find a primary key in your relation. The solution proposed above works without any primary-key assumption. I still think you do have some primary key, but perhaps forgot to mention it.

> The natural key is pa_id. Each pa_id refers to "one thing" for which about 1-30 events are recorded.

If pa_id is related to more than one c_id, then pa_id by itself cannot be the key. If pr_id and r are data, then perhaps (c_id, pa_id, t) is a unique key? Also, your index events_idx is not unique, yet it spans all columns of the relation, so you could have multiple identical rows. Do you want to allow that?
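If that guess holds, a unique index would both document and enforce the key. This is only a sketch of the assumption raised above; the original schema does not guarantee that (c_id, pa_id, t) is unique:

CREATE UNIQUE INDEX events_key ON events (c_id, pa_id, t);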

If you really need both the index events_idx and the proposed events_idx2, then you will be storing the data 3 times in total (twice in indexes, once on the heap).

Since this really is a tricky query optimization, I kindly ask you to consider at least adding a bounty for whoever answers your question, given it has sat unanswered on SO for quite some time.

Edit A: I inserted another data set using your excellent setup above, basically doubling the number of rows, this time with dates starting from '2017-01-10'. All other parameters stayed the same.

Below is a partial index on the time attribute and its query behavior:

CREATE INDEX events_timerange ON events (c_id, pr_id, pa_id, t DESC, r) WHERE '2017-01-03' <= t AND t < '2017-01-06';

Sort  (cost=12510.07..12546.55 rows=14591 width=23) (actual time=361.579..361.595 rows=30 loops=1)
   Sort Key: (count(1)), events.r, events.pr_id DESC
   Sort Method: quicksort  Memory: 27kB
   ->  HashAggregate  (cost=11354.99..11500.90 rows=14591 width=23) (actual time=361.503..361.543 rows=30 loops=1)
         Group Key: events.pr_id, events.r
         ->  Unique  (cost=0.55..8801.60 rows=145908 width=40) (actual time=0.026..265.084 rows=118571 loops=1)
               ->  Index Only Scan using events_timerange on events  (cost=0.55..8014.70 rows=157380 width=40) (actual time=0.024..115.265 rows=155800 loops=1)
                     Index Cond: (c_id = 5)
                     Heap Fetches: 0
 Planning time: 0.214 ms
 Execution time: 361.692 ms
(11 rows)

And without the index events_timerange (that is, with the regular full index):

Sort  (cost=65431.46..65467.93 rows=14591 width=23) (actual time=472.809..472.824 rows=30 loops=1)
   Sort Key: (count(1)), events.r, events.pr_id DESC
   Sort Method: quicksort  Memory: 27kB
   ->  HashAggregate  (cost=64276.38..64422.29 rows=14591 width=23) (actual time=472.732..472.776 rows=30 loops=1)
         Group Key: events.pr_id, events.r
         ->  Unique  (cost=0.56..61722.99 rows=145908 width=40) (actual time=0.024..374.392 rows=118571 loops=1)
               ->  Index Only Scan using events_c_id_pr_id_pa_id_t_r_idx on events  (cost=0.56..60936.08 rows=157380 width=40) (actual time=0.021..222.987 rows=155800 loops=1)
                     Index Cond: ((c_id = 5) AND (t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                     Heap Fetches: 0
 Planning time: 0.171 ms
 Execution time: 472.925 ms
(11 rows)

With the partial index, the run time is about 100 ms faster while the whole table is twice as large. (Note: the second time around it was only 50 ms faster. Still, the advantage should grow as more events are recorded, because queries that need the full index will become slower, as you suspect (and I agree).) Also, on my machine the full index for both inserts is 810 MB (table creation plus the 2017-01-10 addition), while the partial index for WHERE '2017-01-03' <= t AND t < '2017-01-06' is only 91 MB. Maybe you could create partial indexes on a monthly or yearly basis, as sketched below? Depending on the time ranges queried, perhaps only recent data needs to be indexed, or only parts of the old data.
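A sketch of what monthly partial indexes could look like (index names and ranges are illustrative, not from the original answer). One caveat: the planner only uses a partial index when the query's WHERE clause provably implies the index predicate, so each query's time range has to fall within a single month's bounds:

CREATE INDEX events_2017_01 ON events (c_id, pr_id, pa_id, t DESC, r)
    WHERE '2017-01-01' <= t AND t < '2017-02-01';
CREATE INDEX events_2017_02 ON events (c_id, pr_id, pa_id, t DESC, r)
    WHERE '2017-02-01' <= t AND t < '2017-03-01';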

I also tried a partial index with WHERE c_id = 5, thus partitioning by c_id:

Sort  (cost=51324.27..51361.47 rows=14880 width=23) (actual time=550.579..550.592 rows=30 loops=1)
   Sort Key: (count(1)), events.r, events.pr_id DESC
   Sort Method: quicksort  Memory: 27kB
   ->  HashAggregate  (cost=50144.21..50293.01 rows=14880 width=23) (actual time=550.481..550.528 rows=30 loops=1)
         Group Key: events.pr_id, events.r
         ->  Unique  (cost=0.42..47540.21 rows=148800 width=40) (actual time=0.050..443.393 rows=118571 loops=1)
               ->  Index Only Scan using events_cid on events  (cost=0.42..46736.42 rows=160758 width=40) (actual time=0.047..269.676 rows=155800 loops=1)
                     Index Cond: ((t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                     Heap Fetches: 0
 Planning time: 0.366 ms
 Execution time: 550.706 ms
(11 rows)

So partial indexing may be a viable option as well. If you accumulate even more data, you may also consider moving partitions (for example, all rows two years old and older) into separate tables or similar. I think Block Range Indexes (BRIN) might help here, too. If your machine is beefier than mine, you can insert 10x the data volume and first check how the regular full index behaves on the enlarged table.
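For reference, a BRIN index on the time column would be declared as follows (a sketch, my addition; BRIN exists since PostgreSQL 9.5). It is tiny compared to a btree, but it only pays off when the values of t correlate with the physical row order, e.g. with append-only inserts:

CREATE INDEX events_t_brin ON events USING brin (t);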

Answer 1 (score: 1)

[EDITED] OK, since this depends on your data distribution, here is another approach.

First, add the following index:

CREATE INDEX events_idx2 ON events (c_id, t DESC, pr_id, pa_id, r);

This extracts MAX(t) as fast as possible, on the assumption that the subset will be far smaller than the parent table it joins back against. It may be slower, however, if the data set is not that small.

SELECT
    e.pr_id,
    e.r,
    count(1) AS quantity
FROM events e
JOIN (
    SELECT
        pr_id,
        pa_id,
        MAX(t) last_t
    FROM events e
    WHERE
        c_id = 5 
        AND t >= '2017-01-03Z00:00:00' 
        AND t < '2017-01-06Z00:00:00'
    GROUP BY 
        pr_id, 
        pa_id
) latest 
    ON (
        c_id = 5 
        AND latest.pr_id = e.pr_id
        AND latest.pa_id = e.pa_id
        AND latest.last_t = e.t
    )
GROUP BY
    e.pr_id,
    e.r
ORDER BY 3, 2, 1 DESC

Full Fiddle

SQL Fiddle

PostgreSQL 9.3 Schema Setup

--PostgreSQL 9.6
--'\\' is a delimiter

-- CREATE TABLE events AS...

VACUUM  ANALYZE events;
CREATE INDEX idx_events_idx ON events (c_id, t DESC, pr_id, pa_id, r);

Query 1

  -- query A
explain analyze SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id,
          pa_id,
          r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        ORDER BY pr_id, pa_id, t DESC
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC

Results

QUERY PLAN
Sort  (cost=2170.24..2170.74 rows=200 width=15) (actual time=358.239..358.245 rows=30 loops=1)
  Sort Key: (count(1)), events.r, events.pr_id
  Sort Method: quicksort  Memory: 27kB
  ->  HashAggregate  (cost=2160.60..2162.60 rows=200 width=15) (actual time=358.181..358.189 rows=30 loops=1)
        ->  Unique  (cost=2012.69..2132.61 rows=1599 width=40) (actual time=327.345..353.750 rows=12098 loops=1)
              ->  Sort  (cost=2012.69..2052.66 rows=15990 width=40) (actual time=327.344..348.686 rows=15966 loops=1)
                    Sort Key: events.pr_id, events.pa_id, events.t
                    Sort Method: external merge  Disk: 792kB
                    ->  Index Only Scan using idx_events_idx on events  (cost=0.42..896.20 rows=15990 width=40) (actual time=0.059..5.475 rows=15966 loops=1)
                          Index Cond: ((c_id = 5) AND (t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                          Heap Fetches: 0
Total runtime: 358.610 ms

Query 2

  -- query max/JOIN
explain analyze     SELECT
        e.pr_id,
        e.r,
        count(1) AS quantity
    FROM events e
    JOIN (
        SELECT
            pr_id,
            pa_id,
            MAX(t) last_t
        FROM events e
        WHERE
            c_id = 5 
            AND t >= '2017-01-03Z00:00:00' 
            AND t < '2017-01-06Z00:00:00'
        GROUP BY 
            pr_id, 
            pa_id
    ) latest 
        ON (
            c_id = 5 
            AND latest.pr_id = e.pr_id
            AND latest.pa_id = e.pa_id
            AND latest.last_t = e.t
        )
    GROUP BY
        e.pr_id,
        e.r
    ORDER BY 3, 2, 1 DESC 

Results

QUERY PLAN
Sort  (cost=4153.31..4153.32 rows=1 width=15) (actual time=68.398..68.402 rows=30 loops=1)
  Sort Key: (count(1)), e.r, e.pr_id
  Sort Method: quicksort  Memory: 27kB
  ->  HashAggregate  (cost=4153.29..4153.30 rows=1 width=15) (actual time=68.363..68.371 rows=30 loops=1)
        ->  Merge Join  (cost=1133.62..4153.29 rows=1 width=15) (actual time=35.083..64.154 rows=12098 loops=1)
              Merge Cond: ((e.t = (max(e_1.t))) AND (e.pr_id = e_1.pr_id))
              Join Filter: (e.pa_id = e_1.pa_id)
              ->  Index Only Scan Backward using idx_events_idx on events e  (cost=0.42..2739.72 rows=53674 width=40) (actual time=0.010..8.073 rows=26661 loops=1)
                    Index Cond: (c_id = 5)
                    Heap Fetches: 0
              ->  Sort  (cost=1133.19..1137.19 rows=1599 width=36) (actual time=29.778..32.885 rows=12098 loops=1)
                    Sort Key: (max(e_1.t)), e_1.pr_id
                    Sort Method: external sort  Disk: 640kB
                    ->  HashAggregate  (cost=1016.12..1032.11 rows=1599 width=36) (actual time=12.731..16.738 rows=12098 loops=1)
                          ->  Index Only Scan using idx_events_idx on events e_1  (cost=0.42..896.20 rows=15990 width=36) (actual time=0.029..5.084 rows=15966 loops=1)
                                Index Cond: ((c_id = 5) AND (t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                                Heap Fetches: 0
Total runtime: 68.736 ms

Query 3

DROP INDEX idx_events_idx;
CREATE INDEX idx_events_flutter ON events (c_id, pr_id, pa_id, t DESC, r);

Query 5

  -- query A + index by flutter
explain analyze SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id,
          pa_id,
          r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        ORDER BY pr_id, pa_id, t DESC
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC

Results

QUERY PLAN
Sort  (cost=2744.82..2745.32 rows=200 width=15) (actual time=20.915..20.916 rows=30 loops=1)
  Sort Key: (count(1)), events.r, events.pr_id
  Sort Method: quicksort  Memory: 27kB
  ->  HashAggregate  (cost=2735.18..2737.18 rows=200 width=15) (actual time=20.883..20.892 rows=30 loops=1)
        ->  Unique  (cost=0.42..2707.20 rows=1599 width=40) (actual time=0.037..16.488 rows=12098 loops=1)
              ->  Index Only Scan using idx_events_flutter on events  (cost=0.42..2627.25 rows=15990 width=40) (actual time=0.036..10.893 rows=15966 loops=1)
                    Index Cond: ((c_id = 5) AND (t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (t < '2017-01-06 00:00:00'::timestamp without time zone))
                    Heap Fetches: 0
Total runtime: 20.964 ms

Answer 2 (score: 0)

Just two different approaches (YMMV):

-- using a window function to find the record with the most recent t:
EXPLAIN ANALYZE
SELECT pr_id, r, count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id, pa_id,
                 first_value(r) OVER www AS r
                 -- last_value(r) OVER www AS r
        FROM events
        WHERE c_id = 5
         AND t >= '2017-01-03Z00:00:00'
         AND t < '2017-01-06Z00:00:00'
        WINDOW www AS (PARTITION BY pr_id, pa_id ORDER BY t DESC)

        ORDER BY 1, 2, t DESC
    ) sss
    GROUP BY 1, 2
    ORDER BY 3, 2, 1 DESC
        ;

-- Avoiding the window function; find the MAX via NOT EXISTS():

EXPLAIN ANALYZE
SELECT pr_id, r, count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id, pa_id, r
        FROM events e
        WHERE c_id = 5
         AND t >= '2017-01-03Z00:00:00'
         AND t < '2017-01-06Z00:00:00'
         AND NOT EXISTS ( SELECT * FROM events nx
                WHERE nx.c_id = 5 AND nx.pr_id =e.pr_id AND nx.pa_id =e.pa_id
                AND nx.t >= '2017-01-03Z00:00:00'
                AND nx.t < '2017-01-06Z00:00:00'
                AND nx.t > e.t
                )
    ) sss
    GROUP BY 1, 2
    ORDER BY 3, 2, 1 DESC
        ;

Note: the DISTINCT ON can be omitted from the second query; the results are already unique.

Answer 3 (score: 0)

I tried using the standard ROW_NUMBER() function with a matching index, instead of the Postgres-specific DISTINCT ON, to find the "latest" rows.

Index

CREATE INDEX ix_events ON events USING btree (c_id, pa_id, pr_id, t DESC, r);

Query

WITH
CTE_RN
AS
(
    SELECT
        pa_id
        ,pr_id
        ,r
        ,ROW_NUMBER() OVER (PARTITION BY c_id, pa_id, pr_id ORDER BY t DESC) AS rn
    FROM events
    WHERE
        c_id = 5
        AND t >= '2017-01-03Z00:00:00'
        AND t < '2017-01-06Z00:00:00'
)
SELECT
    pr_id
    ,r
    ,COUNT(*) AS quantity
FROM CTE_RN
WHERE rn = 1
GROUP BY 
    pr_id
    ,r
ORDER BY quantity, r, pr_id DESC
;

I don't have Postgres at hand, so I used http://rextester.com for testing. I set scale_factor to 30 in the data-generation script, otherwise rextester would take too long. I got the query plan below. The timing component should be ignored, but you can see that there are no intermediate sorts, only the sort for the final ORDER BY. See http://rextester.com/GUFXY36037

Please try this query on your hardware and your data; it would be interesting to see how it compares to your queries. I noticed that the optimizer does not choose this index if the table has the index that you defined. If you see the same on your server, try dropping or disabling the other indexes to obtain the plan I got.

Sort  (cost=158.07..158.08 rows=1 width=44) (actual time=81.445..81.448 rows=30 loops=1)
  Output: cte_rn.pr_id, cte_rn.r, (count(*))
  Sort Key: (count(*)), cte_rn.r, cte_rn.pr_id DESC
  Sort Method: quicksort  Memory: 27kB
  CTE cte_rn
    ->  WindowAgg  (cost=0.42..157.78 rows=12 width=88) (actual time=0.204..56.215 rows=15130 loops=1)
          Output: events.pa_id, events.pr_id, events.r, row_number() OVER (?), events.t, events.c_id
          ->  Index Only Scan using ix_events3 on public.events  (cost=0.42..157.51 rows=12 width=80) (actual time=0.184..28.688 rows=15130 loops=1)
                Output: events.c_id, events.pa_id, events.pr_id, events.t, events.r
                Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                Heap Fetches: 15130
  ->  HashAggregate  (cost=0.28..0.29 rows=1 width=44) (actual time=81.363..81.402 rows=30 loops=1)
        Output: cte_rn.pr_id, cte_rn.r, count(*)
        Group Key: cte_rn.pr_id, cte_rn.r
        ->  CTE Scan on cte_rn  (cost=0.00..0.27 rows=1 width=36) (actual time=0.214..72.841 rows=11491 loops=1)
              Output: cte_rn.pa_id, cte_rn.pr_id, cte_rn.r, cte_rn.rn
              Filter: (cte_rn.rn = 1)
              Rows Removed by Filter: 3639
Planning time: 0.452 ms
Execution time: 83.234 ms

There is one more optimization you could make that relies on external knowledge of your data.

If you can guarantee that each pa_id, pr_id pair has values for, say, every day, then you can safely narrow the user-defined range of t down to just one day.

This will reduce the number of rows the engine reads and sorts whenever users specify a range of t longer than 1 day.

If you cannot provide such a guarantee for all values in the data, but you still know that all events for a given pa_id, pr_id usually lie close to each other (in t), and users typically supply a wide range for t, you can run a preliminary query to narrow down the t range for the main query.

Something like this:

SELECT
    MIN(MaxT) AS StartT,
    MAX(MaxT) AS EndT
FROM
    (
        SELECT
            pa_id
            ,pr_id
            ,MAX(t) AS MaxT
        FROM events
        WHERE
            c_id = 5
            AND t >= '2017-01-03Z00:00:00'
            AND t < '2017-01-06Z00:00:00'
        GROUP BY
            pa_id
            ,pr_id
    ) AS T

Then use the StartT, EndT found this way in the main query, in the hope that the new range is much narrower than the original user-defined one.

The query above does not have to sort rows, so it should be fast. The main query still has to sort, but there will be fewer rows to sort, so the overall run time may improve.
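For illustration, here is one way the two steps could be folded into a single statement (my sketch, not part of the original answer). Note that because StartT and EndT are no longer constants, the planner cannot use their values for row estimates, so issuing two separate statements as described above may well plan better:

WITH bounds AS (
    SELECT MIN(MaxT) AS StartT, MAX(MaxT) AS EndT
    FROM (
        SELECT MAX(t) AS MaxT
        FROM events
        WHERE c_id = 5
            AND t >= '2017-01-03Z00:00:00'
            AND t < '2017-01-06Z00:00:00'
        GROUP BY pa_id, pr_id
    ) AS T
),
CTE_RN AS (
    SELECT pa_id, pr_id, r,
        ROW_NUMBER() OVER (PARTITION BY c_id, pa_id, pr_id ORDER BY t DESC) AS rn
    FROM events, bounds
    WHERE c_id = 5
        AND t >= bounds.StartT
        AND t <= bounds.EndT  -- EndT is inclusive here, unlike the original half-open range
)
SELECT pr_id, r, COUNT(*) AS quantity
FROM CTE_RN
WHERE rn = 1
GROUP BY pr_id, r
ORDER BY quantity, r, pr_id DESC;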

Answer 4 (score: 0)

So I've taken a stab at this and tried moving your grouping and distinct data out into their own tables, so that we can take advantage of multiple table indexes. Note that this solution only works if you have control over the way data is inserted into the database, i.e. you can change the data-source application. If not, alas, this is moot.

In practice, rather than inserting into the events table straight away, you would first check whether the related date and prpa rows exist in their respective tables. If not, create them. Then fetch their ids and use those in the INSERT statement into the events table, as sketched below.
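A minimal sketch of that insert path for the dates lookup table (my addition), assuming PostgreSQL 9.5+ for ON CONFLICT and a unique index that the DDL below does not create; the same pattern applies to prpa:

-- hypothetical prerequisite, not in the DDL below:
-- CREATE UNIQUE INDEX dates_ymd_uq ON dates (year_part, month_part, day_part);
INSERT INTO dates (year_part, month_part, day_part)
VALUES (2017, 1, 4)
ON CONFLICT (year_part, month_part, day_part) DO NOTHING;

SELECT id FROM dates
WHERE year_part = 2017 AND month_part = 1 AND day_part = 4;
-- use the returned id as date_id when inserting the event row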

Before I start: I saw roughly 10x the performance of query_a from query_c, and the end result of my rewritten query_a comes in at about 4x. If that is not good enough, feel free to click away now.

Given the initial data-seeding query you provided in the first instance, I measured the following baselines:

query_a: 5228.518 ms
query_b: 5708.962 ms
query_c: 538.329 ms

So, roughly a 10x performance increase, give or take.

I am going to alter the data generated in events, and this alteration takes quite a while. You would not need to do this in practice, since your INSERTs into the table would already be covered.

For my optimization, the first step is to create a table holding the dates, transfer the data over, and relate to it from the events table, like so:

CREATE TABLE dates (
    id SERIAL,
    year_part INTEGER NOT NULL,
    month_part INTEGER NOT NULL,
    day_part INTEGER NOT NULL
);
-- Total runtime: 8.281 ms

INSERT INTO dates(year_part, month_part, day_part) SELECT DISTINCT
    EXTRACT(YEAR FROM t), EXTRACT(MONTH FROM t), EXTRACT(DAY FROM t)
FROM events;
-- Total runtime: 12802.900 ms

CREATE INDEX dates_ymd ON dates USING btree(year_part, month_part, day_part);
-- Total runtime: 13.750 ms

ALTER TABLE events ADD COLUMN date_id INTEGER;
-- Total runtime: 2.468ms

UPDATE events SET date_id = dates.id
FROM dates
WHERE EXTRACT(YEAR FROM t) = dates.year_part
AND EXTRACT(MONTH FROM t) = dates.month_part
AND EXTRACT(DAY FROM T) = dates.day_part
;
-- Total runtime: 388024.520 ms

Next, we do the same thing, but with the key pair (pr_id, pa_id). This does not reduce the cardinality much, but when we are talking about large sets it can help with memory usage and swapping in and out:

CREATE TABLE prpa (
    id SERIAL,
    pr_id TEXT NOT NULL,
    pa_id TEXT NOT NULL
);
-- Total runtime: 5.451 ms

CREATE INDEX events_prpa ON events USING btree(pr_id, pa_id);
-- Total runtime: 218,908.894 ms

INSERT INTO prpa(pr_id, pa_id) SELECT DISTINCT pr_id, pa_id FROM events;
-- Total runtime: 5566.760 ms

CREATE INDEX prpa_idx ON prpa USING btree(pr_id, pa_id);
-- Total runtime: 84185.057 ms

ALTER TABLE events ADD COLUMN prpa_id INTEGER;
-- Total runtime: 2.067 ms

UPDATE events SET prpa_id = prpa.id
FROM prpa
WHERE events.pr_id = prpa.pr_id
AND events.pa_id = prpa.pa_id;
-- Total runtime: 757915.192

DROP INDEX events_prpa;
-- Total runtime: 1041.556 ms

Finally, let's get rid of the old index and the now-defunct columns, and then vacuum the new tables:

DROP INDEX events_idx;
-- Total runtime: 1139.508 ms

ALTER TABLE events
    DROP COLUMN pr_id,
    DROP COLUMN pa_id
;
-- Total runtime: 5.376 ms

VACUUM ANALYSE prpa;
-- Total runtime: 1030.142

VACUUM ANALYSE dates;
-- Total runtime: 6652.151

So we now have the following tables:

events (c_id, r, t, prpa_id, date_id)
dates (id, year_part, month_part, day_part)
prpa (id, pr_id, pa_id)

Now let's throw an index at it, pushing t DESC to the end where it belongs. We can do this now because we filter the results on dates before ORDERing, which reduces the need for t DESC to be so prominent in the index:

CREATE INDEX events_idx_new ON events USING btree (c_id, date_id, prpa_id, t DESC);
-- Total runtime: 27697.795
VACUUM ANALYSE events;

Now we rewrite the query (using a table to store intermediate results, which I find works well with large data sets) and awaaaaay we go!

DROP TABLE IF EXISTS temp_results;

SELECT DISTINCT ON (prpa_id)
    prpa_id,
    r
INTO TEMPORARY temp_results
FROM events
INNER JOIN dates
    ON dates.id = events.date_id
WHERE c_id = 5
AND dates.year_part BETWEEN 2017 AND 2017
AND dates.month_part BETWEEN 1 AND 1
AND dates.day_part BETWEEN 3 AND 5
ORDER BY prpa_id, t DESC;

SELECT
    prpa.pr_id,
    r,
    count(1) AS quantity
FROM temp_results
INNER JOIN prpa ON prpa.id = temp_results.prpa_id
GROUP BY
    1,
    2
ORDER BY 3, 2, 1 DESC;
-- Total runtime: 1233.281 ms

So, not a 10x performance increase, but 4x, which is still okay.

This solution combines a couple of techniques I have found to work well with large data sets and date ranges. Even if it is not good enough for your purposes, there may be some gems in here you can repurpose throughout your career.

EDIT:

EXPLAIN ANALYZE on the SELECT INTO query:

Unique  (cost=171839.95..172360.53 rows=51332 width=16) (actual time=819.385..857.777 rows=117471 loops=1)
  ->  Sort  (cost=171839.95..172100.24 rows=104117 width=16) (actual time=819.382..836.924 rows=155202 loops=1)
        Sort Key: events.prpa_id, events.t
        Sort Method: external sort  Disk: 3944kB
        ->  Hash Join  (cost=14340.24..163162.92 rows=104117 width=16) (actual time=126.929..673.293 rows=155202 loops=1)
              Hash Cond: (events.date_id = dates.id)
              ->  Bitmap Heap Scan on events  (cost=14338.97..160168.28 rows=520585 width=20) (actual time=126.572..575.852 rows=516503 loops=1)
                    Recheck Cond: (c_id = 5)
                    Heap Blocks: exact=29610
                    ->  Bitmap Index Scan on events_idx2  (cost=0.00..14208.82 rows=520585 width=0) (actual time=118.769..118.769 rows=516503 loops=1)
                          Index Cond: (c_id = 5)
              ->  Hash  (cost=1.25..1.25 rows=2 width=4) (actual time=0.326..0.326 rows=3 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 1kB
                    ->  Seq Scan on dates  (cost=0.00..1.25 rows=2 width=4) (actual time=0.320..0.323 rows=3 loops=1)
                          Filter: ((year_part >= 2017) AND (year_part <= 2017) AND (month_part >= 1) AND (month_part <= 1) AND (day_part >= 3) AND (day_part <= 5))
                          Rows Removed by Filter: 7
Planning time: 3.091 ms
Execution time: 913.543 ms

EXPLAIN ANALYZE on the SELECT query (note: I had to change the first query to select into an actual table, rather than a temporary one, in order to get this query plan; AFAIK EXPLAIN ANALYZE only works on single queries):

Sort  (cost=89590.66..89595.66 rows=2000 width=15) (actual time=1248.535..1248.537 rows=30 loops=1)
  Sort Key: (count(1)), temp_results.r, prpa.pr_id
  Sort Method: quicksort  Memory: 27kB
  ->  HashAggregate  (cost=89461.00..89481.00 rows=2000 width=15) (actual time=1248.460..1248.468 rows=30 loops=1)
        Group Key: prpa.pr_id, temp_results.r
        ->  Hash Join  (cost=73821.20..88626.40 rows=111280 width=15) (actual time=798.861..1213.494 rows=117471 loops=1)
              Hash Cond: (temp_results.prpa_id = prpa.id)
              ->  Seq Scan on temp_results  (cost=0.00..1632.80 rows=111280 width=8) (actual time=0.024..17.401 rows=117471 loops=1)
              ->  Hash  (cost=36958.31..36958.31 rows=2120631 width=15) (actual time=798.484..798.484 rows=2120631 loops=1)
                    Buckets: 16384  Batches: 32  Memory Usage: 3129kB
                    ->  Seq Scan on prpa  (cost=0.00..36958.31 rows=2120631 width=15) (actual time=0.126..350.664 rows=2120631 loops=1)
Planning time: 1.073 ms
Execution time: 1248.660 ms