Poor performance on a simple query

Asked: 2015-09-04 16:20:17

Tags: sql performance postgresql

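I have a function containing one query that selects the first row and another that selects the last row. Each query takes about 300 ms to execute, and because they run many times, the function is unusable.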

This is the query (a test version; in the function the values come from the function's parameters):

SELECT the_geom
FROM "Entries"
WHERE taxiid = 366 AND "timestamp" BETWEEN '2008-02-06 16:00:00' AND timestamp '2008-02-06 16:00:00' + interval '5 minutes'
ORDER BY entryid DESC
LIMIT 1;

This is the EXPLAIN ANALYZE output for the query:

QUERY PLAN

-------------------------------------------------------------------------
Seq Scan on "Entries"  (cost=0.00..63538.80 rows=70 width=51) (actual time=184.409..342.049 rows=56 loops=1)
  Filter: (("timestamp" >= '2008-02-06 16:00:00'::timestamp without time zone) AND ("timestamp" <= '2008-02-06 16:05:00'::timestamp without time zone) AND (taxiid = 366))
  Rows Removed by Filter: 2128847
Planning time: 0.191 ms
Execution time: 342.088 ms
(5 rows)

Is there a better way to get the first and the last row?

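Edit: Thanks to Drunix, that really helped. However, something is happening that I can't understand. With the index you suggested, I was able to go from ~300 ms to 0.2 ms.

But if I change the interval added to the timestamp to 120 minutes, the index is not used and the query takes around 300 ms again.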

Here is the evidence (5-minute interval):

snowflake=# explain analyze Select the_geom from "Entries" 
where taxiid= 366 and "timestamp" between '2008-02-06 16:00:00' and "timestamp" '2008-02-06 16:00:00' + interval '5 minutes'
ORDER BY entryid ASC 
LIMIT 1;

QUERY PLAN                                                   

-------------------------------------------------------------------------
Limit  (cost=149.52..149.52 rows=1 width=55) (actual time=0.129..0.129 rows=1 loops=1)
->  Sort  (cost=149.52..149.70 rows=73 width=55) (actual time=0.127..0.127 rows=1 loops=1)
     Sort Key: entryid
     Sort Method: top-N heapsort  Memory: 25kB
     ->  Index Scan using entriesindex on "Entries"  (cost=0.43..149.15 rows=73 width=55) (actual time=0.045..0.090 rows=56 loops=1)
           Index Cond: ((taxiid = 366) AND ("timestamp" >= '2008-02-06 16:00:00'::timestamp without time zone) AND ("timestamp" <= '2008-02-06 16:05:00'::timestamp without time zone))
Planning time: 0.266 ms
Execution time: 0.180 ms
(8 rows)

And the other one (120-minute interval):

snowflake=# explain analyze Select the_geom from "Entries" 
where taxiid= 366 and "timestamp" between '2008-02-06 16:00:00' and "timestamp" '2008-02-06 16:00:00' + interval '120 minutes' 
ORDER BY entryid ASC 
LIMIT 1;

QUERY PLAN                                                        

-------------------------------------------------------------------------
Limit  (cost=0.43..60.02 rows=1 width=55) (actual time=245.570..245.570 rows=1 loops=1)
->  Index Scan using "Entries_pkey" on "Entries"  (cost=0.43..97542.75 rows=1637 width=55) (actual time=245.568..245.568 rows=1 loops=1)
     Filter: (("timestamp" >= '2008-02-06 16:00:00'::timestamp without time zone) AND ("timestamp" <= '2008-02-06 18:00:00'::timestamp without time zone) AND (taxiid = 366))
     Rows Removed by Filter: 853963
Planning time: 0.277 ms
Execution time: 245.616 ms
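
A note on reading the two plans: with the wider range the planner estimates 1637 matching rows, so it appears to expect that walking "Entries_pkey" in entryid order and stopping at the first match (LIMIT 1) will be cheap. In practice it discarded 853963 rows before finding one. The following is a minimal sketch of one common way to test the alternative plan, not something taken from the original thread. It assumes entryid is a numeric column, so that entryid + 0 is a valid expression.

```sql
-- Sketch under assumptions: entryid is numeric, and a composite index on
-- (taxiid, "timestamp") exists (entriesindex in the plan above).
-- Ordering by an expression instead of the bare column means "Entries_pkey"
-- can no longer supply the sort order, so the planner has to find the
-- matching rows first and then sort them.
EXPLAIN ANALYZE
SELECT the_geom
FROM "Entries"
WHERE taxiid = 366
  AND "timestamp" BETWEEN timestamp '2008-02-06 16:00:00'
                      AND timestamp '2008-02-06 16:00:00' + interval '120 minutes'
ORDER BY entryid + 0 ASC
LIMIT 1;
```

Comparing the Execution time of this variant with the plan above shows whether the primary-key scan was really the better choice for this range.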

1 answer:

Answer 0 (score: 1)

OK, turning my comment into an answer:

Unless you already have one, you should create a composite index:

create index somename on "Entries" (taxiid, "timestamp");

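Judging by your execution plan, the combination of these fields should be fairly selective, so an index scan should be considerably more efficient. Note that an index on (timestamp, taxiid) would probably be less useful, because it would mainly serve to restrict rows by timestamp. In cases like this, put the columns that are checked for equality first.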
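
To make the column-order point concrete, here is a minimal sketch of the two index definitions side by side. The index names entries_taxi_ts and entries_ts_taxi are hypothetical and only for illustration; the table and column names are taken from the question.

```sql
-- Equality column first: the scan jumps to taxiid = 366 and then reads only
-- the contiguous "timestamp" range within that taxi's entries.
CREATE INDEX entries_taxi_ts ON "Entries" (taxiid, "timestamp");  -- hypothetical name

-- Range column first (for comparison only): the scan has to walk the index
-- entries for every taxi in the timestamp range, checking taxiid as it goes.
CREATE INDEX entries_ts_taxi ON "Entries" ("timestamp", taxiid);  -- hypothetical name

-- Refresh planner statistics, then check which index the planner picks.
ANALYZE "Entries";

EXPLAIN ANALYZE
SELECT the_geom
FROM "Entries"
WHERE taxiid = 366
  AND "timestamp" BETWEEN timestamp '2008-02-06 16:00:00'
                      AND timestamp '2008-02-06 16:00:00' + interval '5 minutes'
ORDER BY entryid DESC
LIMIT 1;
```

In a real database only the first index would normally be kept; the second is shown to illustrate the difference and could be removed with DROP INDEX after the comparison.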