Poor query performance on a large Postgres table

Time: 2017-03-15 12:50:19

Tags: sql postgresql

I have a transactions table partitioned by day. In a large environment, one day's partition takes about 5 GB of disk space and roughly 5,000,000 rows.

The following query over a 24-hour time range takes more than 5 minutes, even though it is using an index.

What can be done to improve this?

EXPLAIN ANALYZE
SELECT * FROM transactions
WHERE end_time > 1488970800000
  AND end_time <= 1489057200000
  AND synthetic_application_id = 1
ORDER BY insertion_time DESC
LIMIT 2000;
  • Note: I removed the plan nodes for the partitions that do not contain this time range, since almost no time is spent there.
Limit  (cost=257809.85..257814.85 rows=2000 width=485) (actual time=323745.024..323758.412 rows=2000 loops=1)
  ->  Sort  (cost=257809.85..257818.83 rows=3592 width=485) (actual time=323745.008..323749.762 rows=2000 loops=1)
        Sort Key: transactions.insertion_time
        Sort Method: top-N heapsort  Memory: 1628kB
        ->  Append  (cost=0.00..257597.73 rows=3592 width=485) (actual time=879.457..323670.299 rows=4608 loops=1)
              ->  Seq Scan on transactions  (cost=0.00..0.00 rows=1 width=2646) (actual time=0.004..0.004 rows=0 loops=1)
                    Filter: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))

              ->  Index Scan using transactions_p2017_end_time_applicati_idx13 on transactions_p2017_03_08  (cost=0.56..123142.03 rows=1698 width=470) (actual time=879.085..167714.455 rows=2112 loops=1)
                    Index Cond: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))
              ->  Index Scan using transactions_p2017_end_time_applicati_idx14 on transactions_p2017_03_09  (cost=0.56..134271.47 rows=1871 width=490) (actual time=395.117..155920.754 rows=2496 loops=1)
                    Index Cond: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))

Planning time: 198.866 ms
Execution time: 323765.693 ms

Adding another run with EXPLAIN (ANALYZE, BUFFERS, TIMING). Some of the data may already have been loaded into the cache, so the numbers are better. (As far as I know, there is no way to clear the cache on Windows.)

"Limit  (cost=227818.94..227823.94 rows=2000 width=474) (actual time=139343.951..139356.216 rows=2000 loops=1)"
"  Buffers: shared hit=795 read=40933 written=246"
"  ->  Sort  (cost=227818.94..227830.39 rows=4579 width=474) (actual   time=139343.943..139348.214 rows=2000 loops=1)"
"        Sort Key: transactions.insertion_time"
"        Sort Method: top-N heapsort  Memory: 1628kB"
"        Buffers: shared hit=795 read=40933 written=246"
"        ->  Append  (cost=0.00..227544.98 rows=4579 width=474) (actual time=733.521..139240.611 rows=4608 loops=1)"
"              Buffers: shared hit=795 read=40933 written=246"
"              ->  Seq Scan on transactions  (cost=0.00..0.00 rows=1 width=2646) (actual time=0.004..0.004 rows=0 loops=1)"
"                    Filter: ((end_time > 1488891600000::bigint) AND (end_time <= 1488978000000::bigint) AND (application_id = 1))"
"              ->  Index Scan using transactions_p2017_end_time_applicati_idx12 on transactions_p2017_03_07  (cost=0.56..101500.07 rows=2134 width=471) (actual time=733.351..120950.487 rows=1728 loops=1)"
"                    Index Cond: ((end_time > 1488891600000::bigint) AND  (end_time <= 1488978000000::bigint) AND (application_id = 1))"
"                    Buffers: shared hit=263 read=19902 written=123"
"              ->  Index Scan using  transactions_p2017_end_time_applicati_idx13 on transactions_p2017_03_08  (cost=0.56..125860.68 rows=2422 width=470) (actual time=114.143..18262.152 rows=2880 loops=1)"
"                    Index Cond: ((end_time > 1488891600000::bigint) AND (end_time <= 1488978000000::bigint) AND (application_id = 1))"
"                    Buffers: shared hit=498 read=21011 written=123"
"Planning time: 23.858 ms"
"Execution time: 139362.264 ms"

3 Answers:

Answer 0 (score: 1)

Create an index on (synthetic_application_id, end_time) and see whether that improves the index scan time.
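A sketch of that suggestion. Since the table is partitioned by day (and, for Postgres versions current in 2017, partitioning is typically inheritance-based), the index has to be created on each child partition separately; the partition names below are taken from the EXPLAIN output above, and the column name follows the query text.

```sql
-- Index with the equality column (synthetic_application_id) first, so
-- the scan is narrowed before the end_time range condition is applied.
-- Repeat for each daily partition the queries touch:
CREATE INDEX ON transactions_p2017_03_08 (synthetic_application_id, end_time);
CREATE INDEX ON transactions_p2017_03_09 (synthetic_application_id, end_time);
```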

Your storage also appears to be quite slow.

Answer 1 (score: 0)

Here is a checklist:

  1. Run VACUUM ANALYZE VERBOSE on all relevant tables. Note how many rows it reports as nonremovable. Also check whether performance improves afterwards.
  2. EXPLAIN ANALYZE VERBOSE provides far more information than what we can currently see.
  3. Try both index column orders (the one you have and the one Laurenz suggested) and see which one is used.
  4. I would also like to add a note explaining Laurenz's answer and why it may solve your problem.

    If your index is on (end_time, application_id), the scan has to visit every end_time in the range and check its application_id, so you will likely have many misses. If instead you can filter on application_id first, you may avoid checking many end_time entries altogether. That is why this can solve your problem. (If you find this useful, you should upvote or accept his answer.)

    Note, however, that you need to create the index on every partition.
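Steps 1 and 2 of the checklist can be sketched as follows, reusing the query from the question; the partition name is one example taken from the plan above, and BUFFERS is included since the question already uses it.

```sql
-- Step 1: reclaim dead tuples and refresh planner statistics on the
-- partitions the query touches; VERBOSE reports nonremovable rows.
VACUUM ANALYZE VERBOSE transactions_p2017_03_08;

-- Step 2: a more detailed plan than plain EXPLAIN ANALYZE.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM transactions
WHERE end_time > 1488970800000
  AND end_time <= 1489057200000
  AND synthetic_application_id = 1
ORDER BY insertion_time DESC
LIMIT 2000;
```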

Answer 2 (score: -1)

Check the Postgres work_mem setting. If the sort cannot be done in memory, you may be hitting the disk, which slows things down considerably.
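A quick way to experiment with this, as a session-level sketch; 256MB is an arbitrary test value, not a recommendation.

```sql
-- Inspect the current setting.
SHOW work_mem;

-- Raise it for the current session only, then re-run the query and
-- compare the EXPLAIN ANALYZE timings and sort method.
SET work_mem = '256MB';
```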