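PostgreSQL explain plan differences

Date: 2014-01-13 19:58:46

Tags: sql postgresql sql-execution-plan

This is my first post...

I have a query that takes longer than I'd like (don't they all!). Depending on what is in my WHERE clause, it can run much faster. I'm trying to understand why the query plans differ and what I can do to speed the query up.

Here's Query #1: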

SELECT date_observed, base_value 
FROM device_read_data 
WHERE fk_device_rw_id IN 
(SELECT fk_device_rw_id FROM equipment_set_rw 
WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
AND date_observed 
BETWEEN '2013-12-01 07:45:00+00'::timestamptz
 AND '2014-01-01 07:59:59+00'::timestamptz
AND base_value ~ '[0-9]+(\.[0-9]+)?'
;

Here's Query Plan #1

"Hash Semi Join  (cost=11.65..5640243.59 rows=92194 width=16) (actual time=34.947..132522.023 rows=43609 loops=1)"
"  Hash Cond: (device_read_data.fk_device_rw_id = equipment_set_rw.fk_device_rw_id)"
"  ->  Seq Scan on device_read_data  (cost=0.00..5449563.56 rows=72157042 width=32) (actual time=0.844..123760.331 rows=71764376 loops=1)"
"        Filter: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone)    AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone) AND   ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text))"
"        Rows Removed by Filter: 82135660"
"  ->  Hash  (cost=11.61..11.61 rows=3 width=16) (actual time=0.018..0.018 rows=1 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"        ->  Bitmap Heap Scan on equipment_set_rw  (cost=4.27..11.61 rows=3 width=16) (actual time=0.016..0.016 rows=1 loops=1)"
"              Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"              ->  Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id  (cost=0.00..4.27 rows=3 width=0) (actual time=0.011..0.011 rows=1 loops=1)"
"                    Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 132530.290 ms"

Here's Query #2:

SELECT date_observed, base_value 
FROM device_read_data 
WHERE fk_device_rw_id IN 
(SELECT fk_device_rw_id FROM equipment_set_rw 
WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
AND date_observed 
BETWEEN  '2014-01-01 07:45:00+00'::timestamptz
 AND '2014-02-01 07:59:59+00'::timestamptz
AND base_value ~ '[0-9]+(\.[0-9]+)?'
;

Here's Query Plan #2

"Nested Loop  (cost=4.27..1869543.46 rows=20391 width=16) (actual time=0.041..2053.656 rows=12997 loops=1)"
"  ->  Bitmap Heap Scan on equipment_set_rw  (cost=4.27..9.73 rows=2 width=16) (actual time=0.015..0.017 rows=1 loops=1)"
"        Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"        ->  Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id  (cost=0.00..4.27 rows=2 width=0) (actual time=0.010..0.010 rows=1 loops=1)"
"              Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"  ->  Index Scan using idx_device_read_data_date_observed_fk_device_rw_id on device_read_data  (cost=0.00..934664.91 rows=10195 width=32) (actual time=0.024..2050.656 rows=12997 loops=1)"
"        Index Cond: ((date_observed >= '2014-01-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-02-01 07:59:59+00'::timestamp with time zone) AND (fk_device_rw_id = equipment_set_rw.fk_device_rw_id))"
"        Filter: ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text)"
"Total runtime: 2055.068 ms"

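I only changed the date range in the WHERE clause. As you can see, Query #1 does a Seq Scan on the table, while Query #2 uses an Index Scan.

I'm trying to determine what causes this, but I can't seem to find the answer.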

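Additional information

  • There is a composite index on (date_observed, fk_device_rw_id)
  • Nothing is ever deleted from this table, so autovacuum should not be needed.
  • I vacuumed the table anyway... but it had no effect.
  • I have rebuilt the indexes on this table
  • I have analyzed this table
  • This system is a copy of Prod and is currently idle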
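
For reference, the maintenance steps listed above correspond roughly to the commands below. This is only a sketch: the post does not show the exact commands that were run, so the options used here (such as VERBOSE) are assumptions. The final query reads the standard pg_stat_user_tables view to confirm when the table was last vacuumed and analyzed.

```sql
-- Assumed equivalents of the steps in the list above; the exact
-- commands and options used in the post are not shown.
VACUUM VERBOSE device_read_data;
REINDEX TABLE device_read_data;
ANALYZE device_read_data;

-- Check when the table was last vacuumed and analyzed,
-- manually or by autovacuum.
SELECT relname,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'device_read_data';
```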

System information

  • Running Postgres 9.2 on Linux
  • 16GB System Ram
  • shared_buffers set to 4GB

What other information can I provide? I'm sure I've left something out.

Thanks for your help.

Edit 1

I tried: set enable_seqscan = false

Here are the explain plan results:

"Hash Semi Join  (cost=2566484.50..7008502.81 rows=92194 width=16) (actual  time=18587.453..182228.966 rows=43609 loops=1)"
"  Hash Cond: (device_read_data.fk_device_rw_id = equipment_set_rw.fk_device_rw_id)"
"  ->  Bitmap Heap Scan on device_read_data  (cost=2566472.85..6817822.78 rows=72157042 width=32) (actual time=18562.247..172074.048 rows=71764376 loops=1)"
"        Recheck Cond: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
"        Rows Removed by Index Recheck: 2102"
"        Filter: ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text)"
"        Rows Removed by Filter: 12265137"
"        ->  Bitmap Index Scan on idx_device_read_data_date_observed_fk_device_rw_id  (cost=0.00..2548433.59 rows=85430682 width=0) (actual time=18556.228..18556.228 rows=84029513 loops=1)"
"              Index Cond: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
"  ->  Hash  (cost=11.61..11.61 rows=3 width=16) (actual time=16.134..16.134 rows=1 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"        ->  Bitmap Heap Scan on equipment_set_rw  (cost=4.27..11.61 rows=3 width=16) (actual time=16.128..16.129 rows=1 loops=1)"
"              Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"              ->  Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id  (cost=0.00..4.27 rows=3 width=0) (actual time=16.116..16.116 rows=1 loops=1)"
"                    Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 182244.181 ms"

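As predicted, the query took longer. Is there anything else that would make this faster?

What are my options?

Thanks.

Edit 2

I tried the rewrite approach. I'm afraid the results were similar to the original. Here's the query plan: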

"Hash Join  (cost=11.65..6013386.19 rows=90835 width=16) (actual time=35.272..127965.785 rows=43609 loops=1)"
"  Hash Cond: (a.fk_device_rw_id = b.fk_device_rw_id)"
"  ->  Seq Scan on device_read_data a  (cost=0.00..5565898.74 rows=71450793 width=32) (actual time=13.050..119667.814 rows=71764376 loops=1)"
"        Filter: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone) AND ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text))"
"        Rows Removed by Filter: 85426425"
"  ->  Hash  (cost=11.61..11.61 rows=3 width=16) (actual time=0.018..0.018 rows=1 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"        ->  Bitmap Heap Scan on equipment_set_rw b  (cost=4.27..11.61 rows=3 width=16) (actual time=0.015..0.016 rows=1 loops=1)"
"              Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"              ->  Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id  (cost=0.00..4.27 rows=3 width=0) (actual time=0.011..0.011 rows=1 loops=1)"
"                    Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 127992.849 ms"

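This seems like a simple problem: return the records in a table that fall within a specific date range. Given my existing system architecture, perhaps there is a threshold on how many records the table can hold before performance is adversely affected.

Unless there are other suggestions, I may need to go with the partitioning approach.

Thanks for your help!

2 Answers:

Answer 0: (score: 1)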

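In your first query, the date range covers a full month of data and matches about 72M of the roughly 154M rows in device_read_data, which is almost half the rows in that table.

An index scan is typically slower than a full table scan for that many rows (because an index scan has to read both index pages and data pages, the total number of disk reads needed to fetch that many rows can be greater than simply reading every data page).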
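
To get a feel for the page counts involved, the planner's own size estimates can be read from the pg_class catalog. The query below is a minimal sketch using the table and index names from the question; relpages and reltuples are estimates that are refreshed by VACUUM and ANALYZE, not exact counts.

```sql
-- Estimated size, in 8kB pages and in rows, of the table and of the
-- composite index named in the question's query plans.
SELECT relname,
       relkind,    -- 'r' = table, 'i' = index
       relpages,   -- estimated number of pages on disk
       reltuples   -- estimated number of rows (or index entries)
FROM pg_class
WHERE relname IN ('device_read_data',
                  'idx_device_read_data_date_observed_fk_device_rw_id');
```

If the index pages plus the heap pages an index scan would have to visit add up to more than relpages for the table itself, a sequential scan is the cheaper choice.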

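You can set enable_seqscan = false before running the first query to see the difference, and if you're feeling adventurous, run your explain as explain (analyze, buffers) <query> to see how many block reads the table scan performs compared with the index scan.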
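
A minimal sketch of that experiment, using Query #1 from the question. Wrapping it in a transaction with SET LOCAL is my own choice rather than something from the post: it keeps the planner setting from leaking into the rest of the session.

```sql
BEGIN;

-- Discourage sequential scans for this transaction only.
SET LOCAL enable_seqscan = off;

-- BUFFERS adds shared-buffer hit/read counts to each plan node.
-- Note that ANALYZE actually executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT date_observed, base_value
FROM device_read_data
WHERE fk_device_rw_id IN
      (SELECT fk_device_rw_id
       FROM equipment_set_rw
       WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
  AND date_observed BETWEEN '2013-12-01 07:45:00+00'::timestamptz
                        AND '2014-01-01 07:59:59+00'::timestamptz
  AND base_value ~ '[0-9]+(\.[0-9]+)?';

-- Ends the transaction and discards the SET LOCAL.
ROLLBACK;
```

Running the same EXPLAIN without the SET LOCAL line gives the sequential-scan numbers to compare against.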

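Edit: For your specific problem, you might make use of partial indexes. You'll have to figure out how to build them so that they cast as wide a net as possible (it's tempting but wasteful to write one partial index per problem), but you might start with something like this: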

create index idx_device_read_data_date_observed_base_value
on device_read_data (date_observed)
where base_value ~ '[0-9]+(\.[0-9]+)?'
;

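That index is built only for the rows that match the base_value pattern. You know better than we do whether this is a fairly restrictive condition (if it does cut down the number of rows to consider, it will do you good).

You could also flip this idea around: index on base_value matching that pattern and make the where condition something like date_observed between '2013-12-01' and '2013-12-31', adding one such index for each month (this way you are likely to get out of hand with indexes; I'd switch to partitioning).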
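
For the partitioning alternative, here is a rough sketch of what a monthly partition could look like on Postgres 9.2, which partitions through table inheritance and CHECK constraints. The child table name, the index name, and the choice of monthly boundaries are all hypothetical; nothing in the post specifies them.

```sql
-- Hypothetical child table holding December 2013 only.
CREATE TABLE device_read_data_2013_12 (
    CHECK (date_observed >= '2013-12-01 00:00:00+00'::timestamptz
       AND date_observed <  '2014-01-01 00:00:00+00'::timestamptz)
) INHERITS (device_read_data);

-- Each child table needs its own indexes (hypothetical name).
CREATE INDEX idx_device_read_data_2013_12_rw_date
    ON device_read_data_2013_12 (fk_device_rw_id, date_observed);

-- With constraint_exclusion at its default of 'partition', the planner
-- can skip child tables whose CHECK constraint rules out the date range.
SHOW constraint_exclusion;
```

Creating the child table does not move existing rows out of the parent, and on 9.2 new rows have to be routed to the correct child by a trigger or by inserting into the child table directly.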

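Another potential improvement might come from rewriting your query. One way is to eliminate the IN condition by joining device_read_data to equipment_set_rw directly, which gives the same result as long as equipment_set_rw has no duplicate fk_device_rw_id for a given fk_equipment_set_id.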
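
The rewritten query itself is not shown here, so what follows is only a sketch of what a join-based rewrite might look like. The aliases a and b are chosen to match the ones that appear in the Edit 2 plan in the question; everything else is copied from Query #1.

```sql
-- Join form of Query #1. Equivalent to the IN form only if
-- equipment_set_rw has no duplicate fk_device_rw_id values for the
-- given fk_equipment_set_id; otherwise rows would be repeated.
SELECT a.date_observed, a.base_value
FROM device_read_data a
JOIN equipment_set_rw b
  ON a.fk_device_rw_id = b.fk_device_rw_id
WHERE b.fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid)
  AND a.date_observed BETWEEN '2013-12-01 07:45:00+00'::timestamptz
                          AND '2014-01-01 07:59:59+00'::timestamptz
  AND a.base_value ~ '[0-9]+(\.[0-9]+)?';
```

The index name uc_fk_equipment_set_id_fk_device_rw_id suggests a unique constraint on that column pair, which would guarantee the no-duplicates condition, but the post does not confirm this.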

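Answer 1: (score: 1)

I tried a few things, and I'm now happy with the performance.

I changed the index on the device_read_data table so that its columns are in the opposite order.

Original index: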

CREATE UNIQUE INDEX idx_device_read_data_date_observed_fk_device_rw_id
  ON device_read_data
  USING btree (date_observed, fk_device_rw_id);

New index:

CREATE UNIQUE INDEX idx_device_read_data_date_observed_fk_device_rw_id
  ON device_read_data
  USING btree (fk_device_rw_id, date_observed);

The fk_device_rw_id column has much lower cardinality. Putting this column first in the index helps filter records faster.
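
One way to compare the cardinality of the two columns is to look at the planner's statistics in the standard pg_stats view. This is a sketch rather than something taken from the post, and the figures it returns are estimates from the last ANALYZE.

```sql
-- n_distinct > 0 is an estimated count of distinct values;
-- n_distinct < 0 is the negated fraction of rows that are distinct
-- (for example, -1 means every row has a different value).
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'device_read_data'
  AND attname IN ('fk_device_rw_id', 'date_observed');
```

With the equality-matched column (fk_device_rw_id) leading, the date range condition selects one contiguous slice of the index for each device instead of a slice that spans every device.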

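Also, make sure the columns in the where clause appear in the same order as in the composite index. (That is now the case.)

I changed the statistics target for the date_observed column, which gives the query planner more information.

Originally it was using the Postgres default of 100. I set it to: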

ALTER TABLE device_read_data ALTER COLUMN date_observed SET STATISTICS 1000;
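
A new statistics target only takes effect the next time the table is analyzed, so a follow-up along these lines is needed. This is a sketch; the post does not show this step. The second statement reads the per-column target back from the pg_attribute catalog.

```sql
-- Re-collect statistics for the column using the new target.
ANALYZE device_read_data (date_observed);

-- Confirm the per-column statistics target.
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'device_read_data'::regclass
  AND attname = 'date_observed';
```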

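Here are the query results. Much, much faster. I could probably tune this further by adjusting other statistics, but for now this approach works well. I may be able to hold off on partitioning.

Thanks for your help.

Query: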

explain Analyze
SELECT date_observed, base_value 
FROM device_read_data 
WHERE fk_device_rw_id IN 
(SELECT fk_device_rw_id FROM equipment_set_rw 
WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
AND (date_observed >= '2013-12-01 07:45:00+00'::timestamptz AND date_observed <= '2014-01-01 07:59:59+00'::timestamptz)
AND base_value ~ '[0-9]+(\.[0-9]+)?'
;

New query plan:

"Nested Loop  (cost=1197.25..264699.54 rows=59694 width=16) (actual time=25.876..493.073 rows=43609 loops=1)"
"  ->  Bitmap Heap Scan on equipment_set_rw  (cost=4.27..9.73 rows=2 width=16) (actual time=0.018..0.019 rows=1 loops=1)"
"        Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"        ->  Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id  (cost=0.00..4.27 rows=2 width=0) (actual time=0.012..0.012 rows=1 loops=1)"
"              Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"  ->  Bitmap Heap Scan on device_read_data  (cost=1192.99..132046.43 rows=29847 width=32) (actual time=25.849..486.701 rows=43609 loops=1)"
"        Recheck Cond: ((fk_device_rw_id = equipment_set_rw.fk_device_rw_id) AND (date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
"        Rows Removed by Index Recheck: 2076173"
"        Filter: ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text)"
"        ->  Bitmap Index Scan on idx_device_read_data_date_observed_fk_device_rw_id  (cost=0.00..1185.53 rows=35640 width=0) (actual time=24.000..24.000 rows=43609 loops=1)"
"              Index Cond: ((fk_device_rw_id = equipment_set_rw.fk_device_rw_id) AND (date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
"Total runtime: 495.506 ms"