Question

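We have two identical (double precision) columns on the same table, each with an identical index, and we run two identical queries against them. Yet one query is nearly 10 times faster than the other. What causes this?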

1) SELECT MIN("reports"."longitude") AS min_id FROM "reports" WHERE (area2 = 18)

2) SELECT MIN("reports"."latitude") AS min_id FROM "reports" WHERE (area2 = 18)

Query 1 runs in 28 ms; query 2 runs in > 300 ms.

Here are the EXPLAIN outputs:

1)

Result  (cost=6.07..6.08 rows=1 width=0)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..6.07 rows=1 width=8)
          ->  Index Scan using longitude on reports  (cost=0.00..139617.49 rows=22983 width=8)
                Index Cond: (longitude IS NOT NULL)
                Filter: (area2 = 18)

2)

Result  (cost=5.95..5.96 rows=1 width=0)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..5.95 rows=1 width=8)
          ->  Index Scan using latitude on reports  (cost=0.00..136754.07 rows=22983 width=8)
                Index Cond: (latitude IS NOT NULL)
                Filter: (area2 = 18)

As requested, here is the EXPLAIN ANALYZE output:

1)

Result  (cost=6.07..6.08 rows=1 width=0) (actual time=10.992..10.993 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..6.07 rows=1 width=8) (actual time=10.985..10.986 rows=1 loops=1)
          ->  Index Scan using longitude on reports  (cost=0.00..139617.49 rows=22983 width=8) (actual time=10.983..10.983 rows=1 loops=1)
                Index Cond: (longitude IS NOT NULL)
                Filter: (area2 = 18)
Total runtime: 11.033 ms

2)

Result  (cost=5.95..5.96 rows=1 width=0) (actual time=259.749..259.749 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..5.95 rows=1 width=8) (actual time=259.740..259.740 rows=1 loops=1)
          ->  Index Scan using latitude on reports  (cost=0.00..136754.07 rows=22983 width=8) (actual time=259.739..259.739 rows=1 loops=1)
                Index Cond: (latitude IS NOT NULL)
                Filter: (area2 = 18)
Total runtime: 259.789 ms
---------------------

What is going on? How can I get the second query to behave and run fast? As far as I can tell, the two setups are identical.

Answer 1

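First, there is no guarantee that an index will speed up a query. Second, when evaluating performance, you need to run each query multiple times. Loading the index and loading pages into the cache can both affect how long a query takes.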
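A minimal sketch of the "run it several times" advice, in Python. The `run_query` argument is a hypothetical zero-argument callable (for example, a function that executes one of the two queries through whatever database driver you use); nothing here is specific to PostgreSQL. The first run is reported separately because it may include the cost of loading index and table pages into cache.

```python
import statistics
import time


def time_runs(run_query, repeats=5):
    """Call run_query() `repeats` times; return each wall-clock duration in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms):
    """Report the first (possibly cold-cache) run apart from the later runs."""
    if len(durations_ms) < 2:
        raise ValueError("need at least two runs to compare cold and warm timings")
    first, rest = durations_ms[0], durations_ms[1:]
    return {"first_ms": first, "warm_median_ms": statistics.median(rest)}
```

If the gap between the two queries persists in the warm median, caching alone probably does not explain it.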

I am not a Postgres expert, but after thinking it through, this result does not surprise me.

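The query plan walks through the index, looking for rows that match area2 = 18, and then presumably stops at the first matching row (because it is using the index, it can start at the lowest value and move upward). This is a guess at how it works; I do not know for certain that Postgres does it this way.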

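In any case, what is happening is that the area sits closer to the beginning of the longitude index than to the beginning of the latitude index, so the longitude scan reaches its first matching record sooner. If this interpretation is correct, it suggests that, compared to everything else in the database, the area is relatively far west (lower longitude) and relatively far north (higher latitude).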
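The effect described above can be illustrated with a small Python simulation. This is only a sketch of the idea (an ordered walk with a filter that stops at the first match), not PostgreSQL's actual implementation, and the data is made up: area 18 is placed far west (low longitude) and far north (high latitude) relative to the other rows.

```python
def rows_examined_for_min(rows, column, area):
    """Walk rows in ascending `column` order, as an index scan would, and stop
    at the first row whose area2 matches. Returns (min_value, rows_examined)."""
    ordered = sorted(
        (r for r in rows if r[column] is not None),  # mirrors "IS NOT NULL"
        key=lambda r: r[column],
    )
    for examined, row in enumerate(ordered, start=1):
        if row["area2"] == area:  # mirrors "Filter: (area2 = 18)"
            return row[column], examined
    return None, len(ordered)


# Made-up data: 1000 rows in area 1, plus 3 rows in area 18.
rows = [{"area2": 1, "longitude": float(i), "latitude": float(i)} for i in range(1000)]
rows += [{"area2": 18, "longitude": -5.0 + i, "latitude": 2000.0 + i} for i in range(3)]

lon_min, lon_examined = rows_examined_for_min(rows, "longitude", 18)
lat_min, lat_examined = rows_examined_for_min(rows, "latitude", 18)
print(lon_min, lon_examined)  # match found on the very first index entry
print(lat_min, lat_examined)  # every area-1 row is visited before the first match
```

Both calls answer the same kind of question over the same rows, yet the amount of work differs by three orders of magnitude purely because of where area 18 falls in each ordering.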

By the way, assuming there are many areas, you would probably get better results with an index on area2.

Answer 2

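You are doing an index scan, but the number of records examined depends on how far into the index you have to go before a row matches the area2 condition.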

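Unless your area2 distribution is unusual, the way to optimize this query is to put composite indexes on (area2, latitude) and (area2, longitude). I suspect you would get < 10 ms. As an alternative to the composite indexes, PG may also be able to use its bitmap heap scan feature to combine a separate index on area2 with the existing indexes.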
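To see why a composite index helps, here is a rough Python sketch that models a composite index on (area2, latitude) as a sorted list of pairs. It is an illustration of the ordering only, not of PostgreSQL internals; the sample rows are invented, and the index name in the comment is hypothetical.

```python
import bisect

# In PostgreSQL the corresponding index could be created with something like:
#   CREATE INDEX reports_area2_latitude_idx ON reports (area2, latitude);
# (the index name is a made-up example)


def build_composite_index(rows, column):
    """Sorted list of (area2, value) pairs, mimicking the ordering of a
    composite index on (area2, column). NULL values are skipped here."""
    return sorted((r["area2"], r[column]) for r in rows if r[column] is not None)


def min_for_area(index, area):
    """Seek directly to the first entry for `area`; it holds the minimum value."""
    pos = bisect.bisect_left(index, (area,))  # (area,) sorts before any (area, x)
    if pos < len(index) and index[pos][0] == area:
        return index[pos][1]
    return None


sample_rows = [
    {"area2": 1, "latitude": 5.0},
    {"area2": 18, "latitude": 51.5},
    {"area2": 18, "latitude": 48.1},
    {"area2": 20, "latitude": 1.0},
]
latitude_index = build_composite_index(sample_rows, "latitude")
print(min_for_area(latitude_index, 18))
```

Because entries for one area are adjacent and sorted by latitude within that area, the minimum is found with a single seek, regardless of where the area lies geographically.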

Inconsistent PostgreSQL index query speed
