UNION ALL takes too long

Asked: 2012-05-16 12:15:35

Tags: performance postgresql query-optimization union-all

I have clustered data spread across multiple tables, each of which generally looks like this:

-- The name is quoted because an unquoted identifier cannot start with a digit.
CREATE TABLE "2012_03_09" (
    guid_key integer,
    property_key integer,
    instance_id_key integer,
    time_stamp timestamp without time zone,
    "value" double precision
);

With these indexes:

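-- Covers the equality filters plus the time range used by the query.
CREATE INDEX "2012_03_09_a"
  ON "2012_03_09"
  USING btree
  (guid_key, property_key, time_stamp);

-- Leads with the timestamp; not used by the plan shown below.
CREATE INDEX "2012_03_09_b"
  ON "2012_03_09"
  USING btree
  (time_stamp, property_key);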

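When I analyze my query, the total time of the Append operation bothers me. Can you explain why the query takes so long to run? Is there any way to optimize a query like this?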

Sort  (cost=262.50..262.61 rows=47 width=20) (actual time=1918.237..1918.246 rows=100 loops=1)    
  Output: 2012_04_26.time_stamp, 2012_04_26.value, 2012_04_26.instance_id_key    
  Sort Key: 2012_04_26.instance_id_key, 2012_04_26.time_stamp    
  Sort Method:  quicksort  Memory: 32kB    
  ->  Append  (cost=0.00..261.19 rows=47 width=20) (actual time=69.817..1917.848 rows=100 loops=1)    
        ->  Index Scan using 2012_04_26_a on 2012_04_26  (cost=0.00..8.28 rows=1 width=20) (actual time=14.909..14.909 rows=0 loops=1)    
              Output: 2012_04_26.time_stamp, 2012_04_26.value, 2012_04_26.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_04_27_a on 2012_04_27  (cost=0.00..8.28 rows=1 width=20) (actual time=1.535..1.535 rows=0 loops=1)    
              Output: 2012_04_27.time_stamp, 2012_04_27.value, 2012_04_27.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_02_a on 2012_05_02  (cost=0.00..12.50 rows=2 width=20) (actual time=53.370..121.894 rows=6 loops=1)    
              Output: 2012_05_02.time_stamp, 2012_05_02.value, 2012_05_02.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_03_a on 2012_05_03  (cost=0.00..24.74 rows=5 width=20) (actual time=59.136..170.215 rows=11 loops=1)    
              Output: 2012_05_03.time_stamp, 2012_05_03.value, 2012_05_03.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_04_a on 2012_05_04  (cost=0.00..12.47 rows=2 width=20) (actual time=67.458..125.172 rows=5 loops=1)    
              Output: 2012_05_04.time_stamp, 2012_05_04.value, 2012_05_04.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_05_a on 2012_05_05  (cost=0.00..8.28 rows=1 width=20) (actual time=14.112..14.112 rows=0 loops=1)    
              Output: 2012_05_05.time_stamp, 2012_05_05.value, 2012_05_05.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_07_a on 2012_05_07  (cost=0.00..12.46 rows=2 width=20) (actual time=60.549..99.999 rows=4 loops=1)    
              Output: 2012_05_07.time_stamp, 2012_05_07.value, 2012_05_07.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_08_a on 2012_05_08  (cost=0.00..24.71 rows=5 width=20) (actual time=63.367..197.296 rows=12 loops=1)    
              Output: 2012_05_08.time_stamp, 2012_05_08.value, 2012_05_08.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_09_a on 2012_05_09  (cost=0.00..28.87 rows=6 width=20) (actual time=59.596..224.685 rows=15 loops=1)    
              Output: 2012_05_09.time_stamp, 2012_05_09.value, 2012_05_09.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_10_a on 2012_05_10  (cost=0.00..28.85 rows=6 width=20) (actual time=56.995..196.590 rows=13 loops=1)    
              Output: 2012_05_10.time_stamp, 2012_05_10.value, 2012_05_10.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_11_a on 2012_05_11  (cost=0.00..20.59 rows=4 width=20) (actual time=62.761..134.313 rows=8 loops=1)    
              Output: 2012_05_11.time_stamp, 2012_05_11.value, 2012_05_11.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_12_a on 2012_05_12  (cost=0.00..8.28 rows=1 width=20) (actual time=12.018..12.018 rows=0 loops=1)    
              Output: 2012_05_12.time_stamp, 2012_05_12.value, 2012_05_12.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_13_a on 2012_05_13  (cost=0.00..8.28 rows=1 width=20) (actual time=12.286..12.286 rows=0 loops=1)    
              Output: 2012_05_13.time_stamp, 2012_05_13.value, 2012_05_13.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_14_a on 2012_05_14  (cost=0.00..16.58 rows=3 width=20) (actual time=92.161..156.802 rows=6 loops=1)    
              Output: 2012_05_14.time_stamp, 2012_05_14.value, 2012_05_14.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_15_a on 2012_05_15  (cost=0.00..25.03 rows=5 width=20) (actual time=73.636..263.537 rows=14 loops=1)    
              Output: 2012_05_15.time_stamp, 2012_05_15.value, 2012_05_15.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_16_a on 2012_05_16  (cost=0.00..12.56 rows=2 width=20) (actual time=100.893..172.404 rows=6 loops=1)    
              Output: 2012_05_16.time_stamp, 2012_05_16.value, 2012_05_16.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
Total runtime: 1918.745 ms

Update:

Posting the SQL query as well:

select time_stamp, value, instance_id_key as segment from perf_hourly_2012_04_26 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_04_27 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_02 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_03 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_04 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_05 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_07 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_08 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_09 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_10 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_11 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_12 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_13 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_14 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_15 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_16 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
ORDER BY 3 ASC, 1 ASC 

3 answers:

Answer 0 (score: 2)

It looks like you should check out PostgreSQL partitioning. Your query would be simpler, and it may perform better (I'm not 100% sure, but I think it's worth a try).
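As a rough sketch of what that could look like with inheritance-based partitioning: the parent name perf_hourly, the child name, and the index name below are assumptions made for illustration, not taken from the asker's schema.

```sql
-- Hypothetical parent table; it holds no rows itself.
CREATE TABLE perf_hourly (
    guid_key        integer,
    property_key    integer,
    instance_id_key integer,
    time_stamp      timestamp without time zone,
    "value"         double precision
);

-- One child per day. The CHECK constraint tells the planner which
-- time range this child can contain.
CREATE TABLE perf_hourly_2012_05_04 (
    CHECK (time_stamp >= '2012-05-04' AND time_stamp < '2012-05-05')
) INHERITS (perf_hourly);

-- Indexes are not inherited, so each child needs its own.
CREATE INDEX perf_hourly_2012_05_04_a
    ON perf_hourly_2012_05_04 (guid_key, property_key, time_stamp);

-- With constraint exclusion, children whose CHECK constraint cannot
-- match the WHERE clause are skipped at plan time.
SET constraint_exclusion = partition;

-- The 16-way UNION ALL collapses into one query against the parent.
SELECT time_stamp, "value", instance_id_key AS segment
FROM perf_hourly
WHERE guid_key = 2105
  AND property_key = 67
  AND time_stamp BETWEEN '2012-04-16 00:00:00' AND '2012-05-16 06:25:50.172'
ORDER BY instance_id_key, time_stamp;
```

Note that the time range in the question covers every daily table listed, so exclusion would not skip any of them for this particular query. The gain here is mainly a simpler query; narrower time ranges are where pruning would pay off.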

Answer 1 (score: 2)

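Aside from the Append, every table appears to get an index scan on the first kind of index. I have to wonder whether that is the best index. Since you seem to be selecting a significant time range, the only other choices are guid_key and property_key. Which is more selective? The more selective column should come first (that is, if you aren't worried about sorting, and I don't think you should be for 100 rows).

Secondly, did you add these indexes for this query or for other queries? If they aren't useful anywhere else, it may make sense to drop them. Indexes can actually degrade performance, particularly if the table records are in memory most of the time, because the database may have to evict records from memory to load the index (and then load the table records again once the index scan completes).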

The only real advice I can give is to play with it.
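One way to start experimenting, as a sketch: compare the planner's distinct-value estimates for the two columns, then try the other column order on a single table. The index name "2012_05_04_c" is made up for this example, and the table name is quoted on the assumption that it really starts with a digit.

```sql
-- Refresh statistics so pg_stats reflects the current data.
ANALYZE "2012_05_04";

-- n_distinct > 0 is an estimated number of distinct values;
-- n_distinct < 0 is the negated fraction of rows that are distinct.
-- The column with more distinct values is the more selective one.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = '2012_05_04'
  AND attname IN ('guid_key', 'property_key');

-- Hypothetical alternative index with the column order swapped.
CREATE INDEX "2012_05_04_c"
    ON "2012_05_04" (property_key, guid_key, time_stamp);
```

Re-running EXPLAIN ANALYZE on that one table before and after shows whether the new order changes anything. Because the query filters both columns with equality, the difference may well be small.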

Edit:

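(There are other questions too, of course, such as why these records have no primary key of any kind and how the tables themselves are clustered; those play a role here as well.)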

Answer 2 (score: 1)

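The UNION is not your timing problem; the elapsed time it reports is basically the sum of the index scan times for each partition. Your _a index looks appropriately selective for your query predicates. The real culprit I see in the EXPLAIN ANALYZE output is that retrieving a handful of rows with an index scan takes a long time on every partition. For example: 125 ms for 5 rows on 2012_05_04.

An index scan should incur 0-5 seeks depending on cache state and table size, plus one seek per data row if the data is not clustered. A slow single-spindle disk should manage a seek plus a block read in ~10 ms, so the worst case for this scan on a crappy storage system is around 100 ms. With the more common 7200 or 10K rpm disks and multiple spindles, the worst case, assuming no cache hits at all, should be under 50 ms. With decent cache retention, I would expect an index scan on one partition to take no more than a few tens of milliseconds.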
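To make the arithmetic concrete, the first query below adds up the "actual time" end values of the 16 index scans in the posted plan. The second reproduces the worst-case estimate, under the assumption that the 100 ms figure comes from 5 index seeks plus 5 row seeks at 10 ms each.

```sql
-- Sum of the per-table index scan times from the plan, in ms.
SELECT 14.909 + 1.535 + 121.894 + 170.215 + 125.172 + 14.112
     + 99.999 + 197.296 + 224.685 + 196.590 + 134.313 + 12.018
     + 12.286 + 156.802 + 263.537 + 172.404 AS index_scans_ms;
-- 1917.767, against 1917.848 ms reported for the Append node

-- Worst case for one table: (index seeks + row seeks) * ms per seek.
SELECT (5 + 5) * 10 AS worst_case_ms;
-- 100
```

The Append node itself accounts for well under a millisecond; practically all of the time is spent inside the individual index scans.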

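Does this query run faster on a second attempt immediately after the first? If so, that points to slow storage combined with a cold cache as the problem. What kind of storage is the database running on? A slow laptop drive or a high-latency network mount would explain the poor I/O performance.

Index scans can also suffer from extreme index bloat. If there are tens or hundreds of dead index entries because of update/delete churn on the data under an inadequate vacuum regime, that could be the culprit. Are these tables vacuumed and analyzed regularly?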
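A sketch of how those questions could be checked from psql, assuming PostgreSQL 9.0 or later (for the BUFFERS option) and table names that really are the quoted dates shown in the question:

```sql
-- Run this twice in a row. A much faster second run points to a cold
-- cache. In the "Buffers:" lines, "read" counts blocks that were not
-- found in shared_buffers and "hit" counts blocks that were.
EXPLAIN (ANALYZE, BUFFERS)
SELECT time_stamp, "value", instance_id_key AS segment
FROM "2012_05_04"
WHERE guid_key = 2105
  AND property_key = 67
  AND time_stamp BETWEEN '2012-04-16 00:00:00' AND '2012-05-16 06:25:50.172';

-- Dead-tuple counts and the last vacuum/analyze times per table.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE '2012%'
ORDER BY n_dead_tup DESC;

-- On-disk size of one index; a size far out of proportion to the
-- row count is a hint of bloat.
SELECT pg_size_pretty(pg_relation_size('"2012_05_04_a"'));

-- Reclaim dead rows and refresh planner statistics for one table.
VACUUM ANALYZE "2012_05_04";
```

If the index turns out to be badly bloated, REINDEX INDEX rebuilds it, at the cost of blocking writes to the table while it runs.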

As Adrian Serafin suggested, you should look into Pg's table partitioning feature.