Huge difference in processing time depending on a slight change in a query parameter

Time: 2018-05-29 11:40:55

Tags: ruby-on-rails postgresql activerecord

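I have two models, PartMaster and Location, with a one-to-many relationship. I have to run a search on the fields part_masters.combo and locations.ubicacion using a left join of the two tables.

My problem is that the same query has completely different processing times depending on the query parameters.

This query takes about 450 ms to execute.

The query plan is this one: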

plan for the query looking for 'P0'

SELECT DISTINCT "part_masters".* FROM "part_masters" LEFT OUTER JOIN 
"locations" ON "locations"."sap_cod" = "part_masters"."sap_cod" WHERE 
(unaccent(locations.ubicacion) ILIKE unaccent('%P0%')) AND 
(unaccent(part_masters.combo) ILIKE unaccent('%junta%')) AND 
(unaccent(part_masters.combo) ILIKE unaccent('%torica%')) ORDER BY 
"part_masters"."sap_cod" ASC

The other query, which only changes 'P0' to 'P01' as the search parameter for locations.ubicacion, takes 38 seconds to execute, and most of the time I get a timeout.

plan for the query looking for 'P01'

SELECT DISTINCT "part_masters".* FROM "part_masters" LEFT OUTER JOIN 
"locations" ON "locations"."sap_cod" = "part_masters"."sap_cod" WHERE 
(unaccent(locations.ubicacion) ILIKE unaccent('%P01%')) AND 
(unaccent(part_masters.combo) ILIKE unaccent('%junta%')) AND 
(unaccent(part_masters.combo) ILIKE unaccent('%torica%')) ORDER BY 
"part_masters"."sap_cod" ASC

EXPLAIN ANALYZE output:

Unique  (cost=3880.72..3880.77 rows=1 width=242) (actual time=39902.298..39902.305 rows=8 loops=1)
  ->  Sort  (cost=3880.72..3880.73 rows=1 width=242) (actual time=39902.297..39902.297 rows=8 loops=1)
        Sort Key: part_masters.sap_cod, part_masters.id, part_masters.descripcion_maestro, part_masters.ref_fabricante, part_masters.fabricante, part_masters.stock, part_masters.precio_medio, part_masters.planta_cod, part_masters.planta_nombre, part_masters.unidad_medida, part_masters.grupo_compras, part_masters.created_at, part_masters.updated_at, part_masters.combinada_maestro, part_masters.precio_estandar, part_masters.fabricante_nombre, part_masters.combo
        Sort Method: quicksort  Memory: 29kB
        ->  Nested Loop  (cost=0.00..3880.71 rows=1 width=242) (actual time=10393.015..39902.250 rows=8 loops=1)
              Join Filter: ((part_masters.sap_cod)::text = (locations.sap_cod)::text)
              Rows Removed by Join Filter: 438318
              ->  Seq Scan on part_masters  (cost=0.00..2451.75 rows=1 width=242) (actual time=2.135..315.211 rows=262 loops=1)
                    Filter: ((unaccent(combo) ~~* unaccent('%junta%'::text)) AND (unaccent(combo) ~~* unaccent('%torica%'::text)))
                    Rows Removed by Filter: 38408
              ->  Seq Scan on locations  (cost=0.00..1409.24 rows=1578 width=5) (actual time=0.107..148.671 rows=1673 loops=262)
                    Filter: (unaccent((ubicacion)::text) ~~* unaccent('%P01%'::text))
                    Rows Removed by Filter: 37586
Total runtime: 39902.358 ms

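I have this index on combo, which is not being used:

part_masters_on_combo_idx UNUSED (trigram index)

locations.ubicacion has no index at all.

According to the plans I linked, there is a huge difference in the actual number of loops and in the number of shared hit blocks. I don't know whether that helps.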

1 answer:

Answer 0: (score: 1)

You need indexes so that PostgreSQL can speed up the query and compute better estimates:

CREATE INDEX ON part_masters (lower(unaccent(combo)) text_pattern_ops);
CREATE INDEX ON locations (lower(unaccent(ubicacion)) text_pattern_ops);

Then run ANALYZE on both tables.
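
Accurate estimates matter here because the slow plan in the question expected a single row from part_masters (rows=1) but actually got 262, so the nested loop repeated the sequential scan of locations 262 times. As a rough check, the figures from that plan can be multiplied out before refreshing the statistics; nothing below goes beyond the numbers and table names already shown in the question.

```sql
-- Figures copied from the EXPLAIN ANALYZE output in the question.
SELECT 262 * 148.671  AS inner_scan_total_ms,   -- 262 loops x 148.671 ms per scan of locations = 38951.802
       262 * 1673 - 8 AS rows_removed_by_join;  -- row pairs compared minus the 8 rows returned = 438318

-- Refresh planner statistics on both tables (run after creating the indexes).
ANALYZE part_masters;
ANALYZE locations;
```

The first product is about 39 seconds, which accounts for nearly all of the 39902 ms total runtime, and the second matches the "Rows Removed by Join Filter" figure in the plan.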

In addition, you have to rewrite each of the three conditions of this form:

unaccent(part_masters.combo) ILIKE unaccent('%junta%')
like this:

lower(unaccent(part_masters.combo)) LIKE lower(unaccent('%junta%'))
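
Applying the same rewrite to all three conditions of the slow query from the question would give something like the following. This is only a sketch: the join and ORDER BY are copied from the question, and only the WHERE clause changes.

```sql
SELECT DISTINCT part_masters.*
FROM part_masters
LEFT OUTER JOIN locations
       ON locations.sap_cod = part_masters.sap_cod
-- Each ILIKE is replaced by LIKE on lower(...) so both sides are compared in lower case.
WHERE lower(unaccent(locations.ubicacion)) LIKE lower(unaccent('%P01%'))
  AND lower(unaccent(part_masters.combo))  LIKE lower(unaccent('%junta%'))
  AND lower(unaccent(part_masters.combo))  LIKE lower(unaccent('%torica%'))
ORDER BY part_masters.sap_cod ASC;
```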

This should give you a significant performance boost.
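
One caveat, offered as a sketch rather than a tested fix: a B-tree index with text_pattern_ops can only serve LIKE patterns that are anchored at the start (such as 'junta%'), so patterns with a leading wildcard like '%junta%' generally need a trigram index from pg_trgm instead. In addition, the stock unaccent() function is declared STABLE, so it cannot be used directly in an index expression; a common workaround is an IMMUTABLE wrapper. The wrapper name immutable_unaccent and the index names below are hypothetical, and the public schema for the extensions is an assumption.

```sql
-- Assumes both extensions are (or can be) installed in schema public.
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical wrapper: schema-qualifies the function and the dictionary so the
-- result does not depend on search_path. Declaring it IMMUTABLE is a promise
-- that the unaccent dictionary will not change underneath the index.
CREATE OR REPLACE FUNCTION immutable_unaccent(text)
RETURNS text
LANGUAGE sql IMMUTABLE AS
$$ SELECT public.unaccent('public.unaccent', $1) $$;

-- Trigram GIN indexes can serve LIKE patterns with a leading wildcard.
CREATE INDEX part_masters_combo_trgm_idx
    ON part_masters USING gin (lower(immutable_unaccent(combo)) gin_trgm_ops);
CREATE INDEX locations_ubicacion_trgm_idx
    ON locations USING gin (lower(immutable_unaccent(ubicacion)) gin_trgm_ops);

ANALYZE part_masters;
ANALYZE locations;
```

For the planner to consider these indexes, the query conditions would have to use the same expression, for example lower(immutable_unaccent(part_masters.combo)) LIKE lower(immutable_unaccent('%junta%')).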