Question

I have a very large table with nearly 1 million rows, and some queries are taking a long time (over a minute).

Here is one that is giving me a particularly hard time:

EXPLAIN ANALYZE SELECT "apps".* FROM "apps" WHERE "apps"."kind" = 'software' ORDER BY itunes_release_date DESC, rating_count DESC LIMIT 12;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=153823.03..153823.03 rows=12 width=2091) (actual time=162681.166..162681.194 rows=12 loops=1)
   ->  Sort  (cost=153823.03..154234.66 rows=823260 width=2091) (actual time=162681.159..162681.169 rows=12 loops=1)
         Sort Key: itunes_release_date, rating_count
         Sort Method: top-N heapsort  Memory: 48kB
         ->  Seq Scan on apps  (cost=0.00..150048.41 rows=823260 width=2091) (actual time=0.718..161561.149 rows=808554 loops=1)
               Filter: (kind = 'software'::text)
 Total runtime: 162682.143 ms
(7 rows)

So, how can I optimize this? The PG version is 9.2.4, FWIW.

There are already indexes on kind and on (kind, itunes_release_date).

Answer 1

It looks like you are missing an index, for example on (kind, itunes_release_date desc, rating_count desc).
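
A minimal sketch of that index, assuming the table and column names from the query in the question. The index name index_apps_on_kind_release_rating is made up for illustration, and CONCURRENTLY is optional: it avoids blocking writes while the index is built, at the cost of a slower build.

```sql
-- Hypothetical index name; the columns and sort directions mirror the
-- query's WHERE and ORDER BY clauses.
CREATE INDEX CONCURRENTLY index_apps_on_kind_release_rating
    ON apps (kind, itunes_release_date DESC, rating_count DESC);

-- Refresh planner statistics, then check the plan again.
ANALYZE apps;

EXPLAIN ANALYZE
SELECT "apps".* FROM "apps"
WHERE "apps"."kind" = 'software'
ORDER BY itunes_release_date DESC, rating_count DESC
LIMIT 12;
```

If the planner chooses the new index, the plan should show an Index Scan in place of the Seq Scan and Sort nodes.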

Answer 2

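How big is the apps table? Do you have at least that much memory allocated to Postgres? If the data has to be read from disk every time, the query will be much slower.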
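
One way to answer both questions from a psql session, sketched with built-in functions and settings; only the table name apps is taken from the question.

```sql
-- Total on-disk size of the table, including its indexes and TOAST data.
SELECT pg_size_pretty(pg_total_relation_size('apps')) AS total_size;

-- Size of the table's main heap alone, without indexes or TOAST.
SELECT pg_size_pretty(pg_relation_size('apps')) AS heap_size;

-- Memory PostgreSQL reserves for its own buffer cache.
SHOW shared_buffers;

-- The planner's assumption about total cache available, OS cache included.
SHOW effective_cache_size;
```

Comparing heap_size with shared_buffers gives a rough idea of whether a full sequential scan can be served from memory.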

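Another thing that may help is to cluster the table on the kind column. This can speed up disk access, because all the software rows will then be stored contiguously on disk.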
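
A sketch of the clustering step, assuming the existing index on kind is called index_apps_on_kind. That name is a guess; the real one can be looked up with \d apps in psql.

```sql
-- Hypothetical index name: substitute the actual index on kind.
-- CLUSTER rewrites the table in index order and holds an
-- ACCESS EXCLUSIVE lock while it runs.
CLUSTER apps USING index_apps_on_kind;

-- Refresh planner statistics after the rewrite.
ANALYZE apps;
```

Clustering is a one-time operation: rows inserted or updated afterwards are not kept in index order, so it may need to be repeated periodically.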

Answer 3

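The only way to speed up this query is to create a composite index on (itunes_release_date, rating_count). It would allow Postgres to pick the first N rows directly from the index.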
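
A minimal sketch of that composite index; the name index_apps_on_release_rating is hypothetical.

```sql
-- Hypothetical index name. A plain ascending b-tree is enough here:
-- PostgreSQL can scan it backward to satisfy
-- ORDER BY itunes_release_date DESC, rating_count DESC.
CREATE INDEX index_apps_on_release_rating
    ON apps (itunes_release_date, rating_count);

-- Refresh planner statistics before re-running EXPLAIN ANALYZE.
ANALYZE apps;
```

With this index the kind = 'software' condition is checked as a filter on each row as it is read. Since the plan in the question shows about 808,000 of the nearly 1 million rows matching, few rows would likely be skipped before 12 are found.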

How do I optimize this SQL query in Postgres?
