I have a transaction table partitioned by day. In a large environment, a single day's partition takes up 5 GB of disk space and holds roughly 5,000,000 rows.
The following query over a 24-hour range takes more than 5 minutes, even though it is using an index.
What can be done to improve this?
EXPLAIN ANALYZE
SELECT * FROM transactions
WHERE end_time > 1488970800000
AND end_time <= 1489057200000
AND synthetic_application_id = 1
ORDER BY insertion_time DESC
LIMIT 2000;
Limit  (cost=257809.85..257814.85 rows=2000 width=485) (actual time=323745.024..323758.412 rows=2000 loops=1)
  ->  Sort  (cost=257809.85..257818.83 rows=3592 width=485) (actual time=323745.008..323749.762 rows=2000 loops=1)
        Sort Key: transactions.insertion_time
        Sort Method: top-N heapsort  Memory: 1628kB
        ->  Append  (cost=0.00..257597.73 rows=3592 width=485) (actual time=879.457..323670.299 rows=4608 loops=1)
              ->  Seq Scan on transactions  (cost=0.00..0.00 rows=1 width=2646) (actual time=0.004..0.004 rows=0 loops=1)
                    Filter: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))
              ->  Index Scan using transactions_p2017_end_time_applicati_idx13 on transactions_p2017_03_08  (cost=0.56..123142.03 rows=1698 width=470) (actual time=879.085..167714.455 rows=2112 loops=1)
                    Index Cond: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))
              ->  Index Scan using transactions_p2017_end_time_applicati_idx14 on transactions_p2017_03_09  (cost=0.56..134271.47 rows=1871 width=490) (actual time=395.117..155920.754 rows=2496 loops=1)
                    Index Cond: ((end_time > 1488970800000::bigint) AND (end_time <= 1489057200000::bigint) AND (application_id = 1))
Planning time: 198.866 ms
Execution time: 323765.693 ms
Adding a second plan, captured with EXPLAIN (ANALYZE, BUFFERS, TIMING). Some of the data may already have been loaded into the cache, so the numbers look better. (As far as I know, there is no way to clear the cache on Windows.)
"Limit (cost=227818.94..227823.94 rows=2000 width=474) (actual time=139343.951..139356.216 rows=2000 loops=1)"
" Buffers: shared hit=795 read=40933 written=246"
" -> Sort (cost=227818.94..227830.39 rows=4579 width=474) (actual time=139343.943..139348.214 rows=2000 loops=1)"
" Sort Key: transactions.insertion_time"
" Sort Method: top-N heapsort Memory: 1628kB"
" Buffers: shared hit=795 read=40933 written=246"
" -> Append (cost=0.00..227544.98 rows=4579 width=474) (actual time=733.521..139240.611 rows=4608 loops=1)"
" Buffers: shared hit=795 read=40933 written=246"
" -> Seq Scan on transactions (cost=0.00..0.00 rows=1 width=2646) (actual time=0.004..0.004 rows=0 loops=1)"
" Filter: ((end_time > 1488891600000::bigint) AND (end_time <= 1488978000000::bigint) AND (application_id = 1))"
"
" -> Index Scan using transactions_p2017_end_time_applicati_idx12 on transactions_p2017_03_07 (cost=0.56..101500.07 rows=2134 width=471) (actual time=733.351..120950.487 rows=1728 loops=1)"
" Index Cond: ((end_time > 1488891600000::bigint) AND (end_time <= 1488978000000::bigint) AND (application_id = 1))"
" Buffers: shared hit=263 read=19902 written=123"
" -> Index Scan using transactions_p2017_end_time_applicati_idx13 on transactions_p2017_03_08 (cost=0.56..125860.68 rows=2422 width=470) (actual time=114.143..18262.152 rows=2880 loops=1)"
" Index Cond: ((end_time > 1488891600000::bigint) AND (end_time <= 1488978000000::bigint) AND (application_id = 1))"
" Buffers: shared hit=498 read=21011 written=123"
"
"Planning time: 23.858 ms"
"Execution time: 139362.264 ms"
Answer 0 (score: 1)
Create an index on (synthetic_application_id, end_time) and see whether that improves the index scan time.
Your storage also seems to be slow.
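A minimal sketch of what that might look like on one of the daily partitions from the plan (the index name is illustrative; note the question's query filters on synthetic_application_id while the plan output shows application_id, so adjust to whichever column actually exists):

CREATE INDEX transactions_p2017_03_08_app_end_idx
    ON transactions_p2017_03_08 (synthetic_application_id, end_time);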
Answer 1 (score: 0)
Here is a checklist:

- Run vacuum analyze verbose. Note how many rows it reports as nonremovable, and see whether performance improves afterwards.
- Run explain analyze verbose; it provides far more information than what we can currently see. (Both commands are sketched below.)

I would also like to add a note explaining Laurentz's answer and why it may solve your problem. If your index is on (end_time, application_id), then it has to check every end_time in the range against the application_ids, and you will most likely get a lot of misses. If, on the other hand, you can check application_id first, you may avoid examining many end_time records. So this may well solve your problem. (If you find this useful, you should upvote or accept his answer.)
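A minimal sketch of the two checklist commands, reusing the table and query from the question unchanged:

VACUUM (ANALYZE, VERBOSE) transactions;

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM transactions
WHERE end_time > 1488970800000
  AND end_time <= 1489057200000
  AND synthetic_application_id = 1
ORDER BY insertion_time DESC
LIMIT 2000;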
Note, however, that you will need to create the index on each partition; a sketch of automating that follows.
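The Append over per-day child tables in the plans suggests inheritance-based partitioning (pre-PostgreSQL 10), where each child needs its own index. A sketch under that assumption, with illustrative index names:

DO $$
DECLARE
    part text;
BEGIN
    -- Loop over every child table of "transactions".
    FOR part IN
        SELECT c.relname
        FROM pg_inherits i
        JOIN pg_class c ON c.oid = i.inhrelid
        WHERE i.inhparent = 'transactions'::regclass
    LOOP
        -- Create the (synthetic_application_id, end_time) index on each partition.
        EXECUTE format(
            'CREATE INDEX IF NOT EXISTS %I ON %I (synthetic_application_id, end_time)',
            part || '_app_end_idx', part);
    END LOOP;
END $$;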
Answer 2 (score: -1)
Check the work_mem setting in Postgres. If it cannot load the entire index into memory, you may be thrashing the disk, which will slow things down considerably.
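A quick way to inspect and raise the setting for one session (the value is illustrative; work_mem bounds per-operation sort memory, such as the top-N heapsort in the plans above):

SHOW work_mem;
SET work_mem = '256MB';
-- Or persist the change cluster-wide and reload the configuration:
ALTER SYSTEM SET work_mem = '256MB';
SELECT pg_reload_conf();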