Why is an indexed ORDER BY query that matches many rows much faster than one that matches only a few?

Time: 2016-08-22 16:33:09

Tags: performance postgresql indexing postgresql-9.3

OK, I have the following query:

explain analyze SELECT seller_region FROM "products" 
  WHERE "products"."seller_region" = 'Bremen'
    AND "products"."state" = 'active' 
  ORDER BY products.rank DESC, 
    products.score ASC NULLS LAST, 
    GREATEST(products.created_at, products.price_last_updated_at) DESC 
  LIMIT 14 OFFSET 0

The query's filter matches about 11,000 rows. If we look at the query plan, we can see that the query uses the index index_products_active_for_default_order and is very fast:

Limit  (cost=0.43..9767.16 rows=14 width=36) (actual time=1.576..6.711 rows=14 loops=1)
  ->  Index Scan using index_products_active_for_default_order on products  (cost=0.43..4951034.14 rows=7097 width=36) (actual time=1.576..6.709 rows=14 loops=1)
        Filter: ((seller_region)::text = 'Bremen'::text)
        Rows Removed by Filter: 3525
Total runtime: 6.724 ms

Now, if I replace 'Bremen' with 'Sachsen' in the query:

explain analyze SELECT seller_region FROM "products" 
  WHERE "products"."seller_region" = 'Sachsen'
    AND "products"."state" = 'active' 
  ORDER BY products.rank DESC, 
    products.score ASC NULLS LAST, 
    GREATEST(products.created_at, products.price_last_updated_at) DESC 
  LIMIT 14 OFFSET 0

The same query now matches only about 70 rows and is consistently very slow, even though it uses the same index in exactly the same way:

Limit  (cost=0.43..1755.00 rows=14 width=36) (actual time=2.498..1831.737 rows=14 loops=1)
  ->  Index Scan using index_products_active_for_default_order on products  (cost=0.43..4951034.14 rows=39505 width=36) (actual time=2.496..1831.727 rows=14 loops=1)
        Filter: ((seller_region)::text = 'Sachsen'::text)
        Rows Removed by Filter: 963360
Total runtime: 1831.760 ms

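I don't understand why this happens. Intuitively, I would expect the query that matches more rows to be slower, but the opposite is the case. I have tested this with other queries on other columns of my table, and the behavior is the same: given two similar queries with the same ordering as above, the one that matches more rows is about 100 times faster than the one whose filter matches only a few rows. Why is that, and how can I avoid this behavior?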

PS: I am using Postgres 9.3, and the index is defined as follows:

CREATE INDEX index_products_active_for_default_order
  ON products
  USING btree
  (rank DESC, score COLLATE pg_catalog."default", (GREATEST(created_at, price_last_updated_at)) DESC)
  WHERE state::text = 'active'::text;

1 Answer:

Answer 0 (score: 0):

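This is because the index is scanned in sort order and every row that does not match the region filter is discarded. For Bremen, the first 14 matching rows are found within the first 3539 index rows (3525 removed by the filter plus 14 returned), whereas for Sachsen 963374 index rows have to be scanned (963360 removed plus 14 returned).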
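To check how skewed the regions are in your own data, something like the following sketch may help. It uses only the table and column names from the question. Note also that the plan for 'Sachsen' estimates rows=39505 while the question says only about 70 rows match; if that count is right, the planner's statistics may be stale, and refreshing them is a cheap first step (whether it changes the chosen plan is not guaranteed).

```sql
-- How many active rows does each region actually have?
-- Rare regions are the ones for which the ordered index scan
-- must discard many non-matching rows before finding 14 hits.
SELECT seller_region, count(*) AS active_rows
FROM products
WHERE state = 'active'
GROUP BY seller_region
ORDER BY active_rows DESC;

-- Refresh the planner statistics for the table, then re-run
-- EXPLAIN ANALYZE and compare the estimated and actual row counts.
ANALYZE products;
```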

I recommend an index on (seller_region, rank).
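A minimal sketch of such an index follows. The index names are hypothetical, and the partial-index predicate on state is an assumption carried over from the existing index in the question rather than part of the recommendation itself.

```sql
-- Hypothetical name; leading with seller_region lets the scan jump
-- straight to the rows of one region instead of filtering them out.
CREATE INDEX index_products_active_by_region_rank
  ON products
  USING btree
  (seller_region, rank DESC)
  WHERE state::text = 'active'::text;

-- Possible variant (also an assumption): repeat the full sort key
-- after seller_region so the ORDER BY can be read from the index.
CREATE INDEX index_products_active_by_region_default_order
  ON products
  USING btree
  (seller_region, rank DESC, score,
   (GREATEST(created_at, price_last_updated_at)) DESC)
  WHERE state::text = 'active'::text;
```

With the two-column index, the remaining ORDER BY expressions still have to be sorted, but only over the rows of a single region. The variant avoids that sort at the cost of a larger index. Either way, it is worth confirming with EXPLAIN ANALYZE that the planner actually picks the new index.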