Question

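My Postgres version: "PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 32-bit"
The table I am working with contains the columns
1. eventtime - timestamp without time zone
2. serialnumber - character varying(32)
3. sourceid - integer

and 4 other columns

Here is my select statement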

SELECT eventtime, serialnumber
    FROM   t_el_eventlog
    WHERE
    eventtime at time zone 'CET'  >  CURRENT_DATE  and
    sourceid = '14';

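The execution time of the above query is 59647 ms.

In my R script I have 5 queries of this kind (execution time = 59647 ms * 5). Without the time zone 'CET' the execution time is much lower, but in my case I have to use the time zone 'CET', and if I am right, the high execution time is caused by this time zone conversion.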

My query plan: (image not included)

Query plan as text: (image not included)

EXPLAIN ANALYZE of the query without the time zone: (image not included)

Is there any way I can reduce the execution time of this select statement?

Answer 1

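Since I don't know the distribution of the values, there is no clear-cut way to solve the problem.

But one problem is obvious: there is an index on the eventtime column, but because the query applies a function to that column, the index cannot be used.

eventtime at time zone 'UTC' > CURRENT_DATE

Either the index has to be dropped and recreated with the function, or the query has to be rewritten.

Recreating the index (example):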

CREATE INDEX ON t_el_eventlog (timezone('UTC'::text, eventtime));

(this is the same as eventtime at time zone 'UTC')

Now the filter matches the indexed expression, so the index can be used.
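The query in the question uses 'CET' rather than 'UTC', and an expression index is only considered when the expression in the filter is the same as the indexed one. The following is a minimal sketch of that variant, assuming the table and columns from the question; the index name is made up for illustration.

```sql
-- Expression index using the same zone as the filter in the question.
-- The index name is hypothetical; the extra parentheses are required
-- around an index expression that is not a plain function call.
CREATE INDEX t_el_eventlog_eventtime_cet_idx
    ON t_el_eventlog ((eventtime AT TIME ZONE 'CET'));

-- Refresh planner statistics, including those for the new expression.
ANALYZE t_el_eventlog;

-- Show the plan without running the query, to see whether the
-- planner chooses the new index for this filter.
EXPLAIN
SELECT eventtime, serialnumber
FROM   t_el_eventlog
WHERE  eventtime AT TIME ZONE 'CET' > CURRENT_DATE
AND    sourceid = 14;
```

Whether the planner actually picks the index still depends on how many rows match, so the EXPLAIN output is worth checking on the real data.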

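I suspect that sourceid is not well distributed and has only a few distinct values. In that case, it might be an idea to drop the index on sourceid and the index on eventtime, and create a new index on both eventtime and sourceid: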

CREATE INDEX ON t_el_eventlog (timezone('UTC'::text, eventtime), sourceid);
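Dropping the old single-column indexes requires their names, which are not given in the question. A rough sketch of how they could be looked up and removed; both index names in the DROP statements are placeholders, not the real ones.

```sql
-- List the indexes that currently exist on the table,
-- together with their definitions.
SELECT indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 't_el_eventlog';

-- Drop the single-column indexes that the combined index replaces.
-- Both names below are placeholders; substitute the names
-- returned by the query above.
DROP INDEX IF EXISTS t_el_eventlog_eventtime_idx;
DROP INDEX IF EXISTS t_el_eventlog_sourceid_idx;
```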

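So much for the theory. I ran some tests with a table of about 10 million rows, event times spread over 36 hours, and only 20 distinct sources (1..20). The distribution is very random. The best results came from an index on eventtime, sourceid (no function index) combined with an adjusted query.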

CREATE INDEX ON t_el_eventlog (eventtime, sourceid);
-- make sure there is no index on source id. we need to force postgres to this index.

-- make sure, postgres learns about our index
ANALYZE; VACUUM;

-- use timezone function on current date (guessing timezone is CET)
SELECT * FROM t_el_eventlog
 WHERE eventtime > timezone('CET',CURRENT_DATE) AND sourceid = 14;

With 10,000,000 rows in the table, this query returns about 500,000 rows in only 400 ms (instead of roughly 1400 to 1700 ms with all other combinations).
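A comparison like this can be reproduced on synthetic data. The sketch below is one possible setup and not the exact one used for the numbers above: the table name evlog_test, the md5-based serial numbers, and the way the random values are generated are all assumptions.

```sql
-- Hypothetical test table with the three columns named in the question.
CREATE TABLE evlog_test (
    eventtime    timestamp without time zone,
    serialnumber character varying(32),
    sourceid     integer
);

-- About 10 million rows: event times spread randomly over the last
-- 36 hours, sourceid drawn randomly from 1..20.
INSERT INTO evlog_test (eventtime, serialnumber, sourceid)
SELECT now()::timestamp - random() * interval '36 hours',
       md5(g::text),                        -- 32 hex characters
       1 + floor(random() * 20)::integer
FROM   generate_series(1, 10000000) AS g;

-- Combined index without a function, as in the best-performing variant.
CREATE INDEX ON evlog_test (eventtime, sourceid);

-- Update statistics and the visibility map before measuring.
VACUUM ANALYZE evlog_test;

-- Run the query and report actual timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   evlog_test
WHERE  eventtime > timezone('CET', CURRENT_DATE)
AND    sourceid = 14;
```

The timings will differ from machine to machine, so only the relative difference between index variants on the same system is meaningful.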

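The task is to find the best match between index and query. I suggest doing some research; a good starting point is http://use-the-index-luke.com

This is the query plan for the last approach: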

Index Only Scan using evlog_eventtime_sourceid_idx on evlog  (cost=0.45..218195.13 rows=424534 width=0)
     Index Cond: ((eventtime > timezone('CET'::text, (('now'::cstring)::date)::timestamp with time zone)) AND (sourceid = 14))

As you can see, this is a perfect match...
