Question

I have run into strange PostgreSQL behavior. I have partitioned the "History" table by time into smaller pieces: History -> History_part_YYYY-MM

Check constraints:
"History_part_2013-11_sentdate_check" CHECK (sentdate >= '2013-11-01 00:00:00-04'::timestamp with time zone AND sentdate < '2013-12-01 00:00:00-05'::timestamp with time zone)

Inherits: "History"

Each partition has its own index on the transaction_id column.

"History_part_2013-11_transaction_id_idx" btree (transaction_id)

As far as I can tell, there is nothing special about the way the table is partitioned; it follows the PostgreSQL tutorial.
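For reference, a partition like the one described above could be created roughly as follows. This is only a sketch reconstructed from the constraint and index listings shown; it assumes the parent table "History" already exists with sentdate and transaction_id columns, and it is not the asker's actual DDL.

```sql
-- Sketch: inheritance-based partition for November 2013.
-- Assumes the parent table "History" already exists.
CREATE TABLE "History_part_2013-11" (
    -- The CHECK constraint is what lets the planner skip this
    -- partition when a query's sentdate range cannot match it.
    CHECK (sentdate >= '2013-11-01 00:00:00-04'::timestamp with time zone
       AND sentdate <  '2013-12-01 00:00:00-05'::timestamp with time zone)
) INHERITS ("History");

-- Indexes are not inherited, so each partition gets its own.
CREATE INDEX "History_part_2013-11_transaction_id_idx"
    ON "History_part_2013-11" USING btree (transaction_id);
```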

The problem is that executing this query is slow:

SELECT * FROM "History" WHERE transaction_id = 'MMS-dev-23599-2013-12-11-13:03:53.349735' LIMIT 1;

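I was able to narrow the problem down: the query is slow only the FIRST time it runs in each script; if it runs a second time, it is fast. If it runs again in a separate script, it is slow again, and the second run (within that script) is fast again... I really have no explanation for this. It is not inside any transaction.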

Here are sample execution times for two queries run one after another in the same script:

1.33s   SELECT * FROM "History" WHERE transaction_id = 'MMS-dev-14970-2013-12-11-13:18:29.889376' LIMIT 1;...
0.019s  SELECT * FROM "History" WHERE transaction_id = 'MMS-dev-14970-2013-12-11-13:18:29.889376' LIMIT 1;

The first query is slow enough to trigger an 'explain analyze' call, which looks like this (and is itself really, really fast):

    Limit  (cost=0.00..8.07 rows=1 width=2589) (actual time=0.972..0.973 rows=1 loops=1)
  ->  Result  (cost=0.00..581.07 rows=72 width=2589) (actual time=0.964..0.964 rows=1 loops=1)
        ->  Append  (cost=0.00..581.07 rows=72 width=2589) (actual time=0.958..0.958 rows=1 loops=1)
              ->  Seq Scan on "History"  (cost=0.00..1.00 rows=1 width=3760) (actual time=0.015..0.015 rows=0 loops=1)
                    Filter: ((transaction_id)::text = 'MMS-dev-23595-2013-12-11-13:20:10.422306'::text)
              ->  Index Scan using "History_part_2013-10_transaction_id_idx" on "History_part_2013-10" "History"  (cost=0.00..8.28 rows=1 width=1829) (actual time=0.040..0.040 rows=0 loops=1)
                    Index Cond: ((transaction_id)::text = 'MMS-dev-23595-2013-12-11-13:20:10.422306'::text)
              ->  Index Scan using "History_part_2013-02_transaction_id_idx" on "History_part_2013-02" "History"  (cost=0.00..8.32 rows=1 width=1707) (actual time=0.021..0.021 rows=0 loops=1)
                    Index Cond: ((transaction_id)::text = 'MMS-dev-23595-2013-12-11-13:20:10.422306'::text)

.... and so on, checking all the tables (about 54 right now; a few of them are empty, created for future months), ending with:

->  Index Scan using "History_part_2014-10_transaction_id_idx" on "History_part_2014-10" "History"  (cost=0.00..8.27 rows=1 width=3760) (never executed)
                    Index Cond: ((transaction_id)::text = 'MMS-dev-23595-2013-12-11-13:20:10.422306'::text)

Total runtime: 6.390 ms

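The total runtime is 0.006 s, yet the first query always takes more than 1 s. When more concurrent scripts are running (each with a UNIQUE transaction_id), the first execution can take up to 20 s, while the second takes only a few milliseconds.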

Has anyone experienced this? I would like to know whether I am doing something wrong or whether this might be a PostgreSQL issue.

I upgraded PostgreSQL from 9.2.4 -> 9.2.5. It seems a little better, but the problem definitely persists.

UPDATE: I now use this query:

SELECT * FROM "History" WHERE transaction_id = 'MMS-live-15425-2013-18-11-17:32:20.917198' AND sentdate>='2013-10-18' AND sentdate<'2013-11-19' LIMIT 1

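The first time it runs in a script, it takes 3 to 8 SECONDS when many queries are running against this table at once (if only one script runs at a time, it is much faster).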

When I change the first query in the script to call the partition table directly:

SELECT * FROM "History_part_2013-11" WHERE transaction_id = 'MMS-live-15425-2013-18-11-17:32:20.917198' AND sentdate>='2013-10-18' AND sentdate<'2013-11-19' LIMIT 1

it takes about 0.03 s, which is far faster, BUT the next query in the script that goes through the "History" table still takes 3-8 SECONDS.

Here is the explain analyze output for the first query against "History":

    Limit  (cost=0.00..25.41 rows=1 width=2540) (actual time=0.129..0.130 rows=1 loops=1)
  ->  Result  (cost=0.00..76.23 rows=3 width=2540) (actual time=0.121..0.121 rows=1 loops=1)
        ->  Append  (cost=0.00..76.23 rows=3 width=2540) (actual time=0.117..0.117 rows=1 loops=1)
              ->  Seq Scan on "History"  (cost=0.00..58.00 rows=1 width=3750) (actual time=0.060..0.060 rows=0 loops=1)
                    Filter: ((sentdate >= '2013-10-18 00:00:00-04'::timestamp with time zone) AND (sentdate < '2013-11-19 00:00:00-05'::timestamp with time zone) AND ((transaction_id)::text = 'MMS-live-15425-2013-18-11-17:32:20.917198'::text))
              ->  Index Scan using "History_part_2013-11_transaction_id_idx" on "History_part_2013-11" "History"  (cost=0.00..8.36 rows=1 width=1985) (actual time=0.051..0.051 rows=1 loops=1)
                    Index Cond: ((transaction_id)::text = 'MMS-live-15425-2013-18-11-17:32:20.917198'::text)
                    Filter: ((sentdate >= '2013-10-18 00:00:00-04'::timestamp with time zone) AND (sentdate < '2013-11-19 00:00:00-05'::timestamp with time zone))
              ->  Index Scan using "History_part_2013-10_transaction_id_idx" on "History_part_2013-10" "History"  (cost=0.00..9.87 rows=1 width=1884) (never executed)
                    Index Cond: ((transaction_id)::text = 'MMS-live-15425-2013-18-11-17:32:20.917198'::text)
                    Filter: ((sentdate >= '2013-10-18 00:00:00-04'::timestamp with time zone) AND (sentdate < '2013-11-19 00:00:00-05'::timestamp with time zone))
Total runtime: 0.572 ms

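It seems the query is ALWAYS slow when run against the 'main' History table (but not when a partition is called directly), and only the first time. Is this some caching effect? But then why is calling the partition directly so much faster, given that the query on the main History table no longer checks ALL the tables?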
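One possible way to test the caching hypothesis is to time the planning step on its own, since the "Total runtime" reported by EXPLAIN ANALYZE does not include parsing or planning time. The sketch below assumes the queries are run from a freshly opened psql session; `\timing` is a psql meta-command, not SQL.

```sql
-- Run in a fresh psql session so nothing is cached for this backend yet.
\timing on

-- EXPLAIN without ANALYZE plans the query but does not execute it,
-- so the elapsed time reported by \timing is mostly parse/plan overhead.
EXPLAIN SELECT * FROM "History"
WHERE transaction_id = 'MMS-dev-14970-2013-12-11-13:18:29.889376'
LIMIT 1;

-- Same statement again in the same session, for comparison.
EXPLAIN SELECT * FROM "History"
WHERE transaction_id = 'MMS-dev-14970-2013-12-11-13:18:29.889376'
LIMIT 1;
```

If the first EXPLAIN is slow and the second is fast, the delay lies in planning the query over the parent table and its partitions rather than in executing the index scans.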

Answer 1

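See the comment above: the partitioning criterion (sentdate) must be included in the query, and it must be a constant expression, for partition exclusion to work.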
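As a rough illustration of that point, the sketch below checks the relevant planner setting and then plans a query whose sentdate bounds are literal constants. The values are taken from the question; whether the session setting needs changing at all is an assumption, since `partition` is the documented default.

```sql
-- Constraint exclusion must be 'partition' (the default) or 'on'.
SHOW constraint_exclusion;
SET constraint_exclusion = partition;

-- The sentdate bounds are literal constants, so the planner can compare
-- them with each partition's CHECK constraint and drop the partitions
-- that cannot contain matching rows.
EXPLAIN
SELECT * FROM "History"
WHERE transaction_id = 'MMS-live-15425-2013-18-11-17:32:20.917198'
  AND sentdate >= '2013-10-18'
  AND sentdate <  '2013-11-19'
LIMIT 1;
```

With exclusion working, the plan should list only the parent table and the partitions whose date ranges overlap the bounds, as in the second plan in the question, which scans just "History", "History_part_2013-11" and "History_part_2013-10".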

PostgreSQL index fetch on a partitioned table
