How can I speed up this slow query with the right indexes?

Date: 2014-02-27 01:44:51

Tags: sql postgresql

SELECT "items".* FROM "items" 
INNER JOIN item_mods ON item_mods.item_id = items.id 
INNER JOIN mods ON mods.id = item_mods.mod_id 
AND item_mods.mod_id = 3 
WHERE (items.player_id = '1') 
GROUP BY items.id, item_mods.primary_value 
ORDER BY item_mods.primary_value DESC NULLS LAST, items.created_at DESC LIMIT 100

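This query currently takes about 7 seconds. I have about 550k records in the items table, about 2.5 million records in the item_mods table, and about 800 records in the mods table. I have a lot of indexes, but I'm not sure I'm using the right ones.

So if you were optimizing this query, what would you recommend?

Here is the EXPLAIN ANALYZE output.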

http://explain.depesz.com/s/aiYH

"Limit  (cost=107274.88..107275.13 rows=100 width=554) (actual time=6648.872..6648.888 rows=100 loops=1)"
"  ->  Sort  (cost=107274.88..107419.24 rows=57745 width=554) (actual time=6648.870..6648.879 rows=100 loops=1)"
"        Sort Key: item_mods.primary_value, items.created_at"
"        Sort Method: top-N heapsort  Memory: 103kB"
"        ->  Group  (cost=104634.82..105067.91 rows=57745 width=554) (actual time=6358.348..6529.342 rows=57498 loops=1)"
"              ->  Sort  (cost=104634.82..104779.18 rows=57745 width=554) (actual time=6358.344..6423.184 rows=57498 loops=1)"
"                    Sort Key: items.id, item_mods.primary_value"
"                    Sort Method: external sort  Disk: 25624kB"
"                    ->  Nested Loop  (cost=23182.35..71248.94 rows=57745 width=554) (actual time=3339.625..6127.659 rows=57498 loops=1)"
"                          ->  Index Scan using mods_pkey on mods  (cost=0.00..8.27 rows=1 width=4) (actual time=0.323..0.324 rows=1 loops=1)"
"                                Index Cond: (id = 3)"
"                          ->  Merge Join  (cost=23182.35..70663.22 rows=57745 width=558) (actual time=3339.298..6108.202 rows=57498 loops=1)"
"                                Merge Cond: (items.id = item_mods.item_id)"
"                                ->  Index Scan using items_pkey on items  (cost=0.00..45112.64 rows=543004 width=550) (actual time=3.190..2575.715 rows=543024 loops=1)"
"                                      Filter: (player_id = 1)"
"                                ->  Materialize  (cost=23182.33..23471.20 rows=57774 width=12) (actual time=3336.099..3388.810 rows=57547 loops=1)"
"                                      ->  Sort  (cost=23182.33..23326.76 rows=57774 width=12) (actual time=3336.095..3370.179 rows=57547 loops=1)"
"                                            Sort Key: item_mods.item_id"
"                                            Sort Method: external sort  Disk: 1240kB"
"                                            ->  Bitmap Heap Scan on item_mods  (cost=1084.27..17622.45 rows=57774 width=12) (actual time=31.728..3263.762 rows=57547 loops=1)"
"                                                  Recheck Cond: (mod_id = 3)"
"                                                  ->  Bitmap Index Scan on primary_value_mod_id_desc  (cost=0.00..1069.83 rows=57774 width=0) (actual time=29.565..29.565 rows=57547 loops=1)"
"                                                        Index Cond: (mod_id = 3)"
"Total runtime: 6652.100 ms"

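Update

I've modified the query as suggested. I was using GROUP BY only to select one row per item ID, but I think DISTINCT works for that too. Here are the new query and its EXPLAIN output; it still takes a long time. The idea of the query is to find all items belonging to player '1' that have item mod '3', ordered by the highest primary value of that mod.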

SELECT DISTINCT("items".id), "item_mods".primary_value, "items".created_at 
FROM "items" INNER JOIN item_mods ON item_mods.item_id = items.id 
INNER JOIN mods ON mods.id = item_mods.mod_id AND item_mods.mod_id = 3 
WHERE (items.player_id = '1') 
ORDER BY item_mods.primary_value DESC NULLS LAST, items.created_at DESC LIMIT 100

Explain: http://explain.depesz.com/s/t4Zq

"Limit  (cost=73737.59..73738.59 rows=100 width=16) (actual time=6450.253..6450.344 rows=100 loops=1)"
"  ->  Unique  (cost=73737.59..74315.04 rows=57745 width=16) (actual time=6450.248..6450.316 rows=100 loops=1)"
"        ->  Sort  (cost=73737.59..73881.95 rows=57745 width=16) (actual time=6450.242..6450.272 rows=100 loops=1)"
"              Sort Key: item_mods.primary_value, items.created_at, items.id"
"              Sort Method: external merge  Disk: 1456kB"
"              ->  Hash Join  (cost=46944.77..68183.71 rows=57745 width=16) (actual time=3018.769..6342.109 rows=57498 loops=1)"
"                    Hash Cond: (item_mods.item_id = items.id)"
"                    ->  Nested Loop  (cost=1084.27..18208.45 rows=57774 width=8) (actual time=15.911..3219.086 rows=57547 loops=1)"
"                          ->  Index Scan using mods_pkey on mods  (cost=0.00..8.27 rows=1 width=4) (actual time=0.486..0.489 rows=1 loops=1)"
"                                Index Cond: (id = 3)"
"                          ->  Bitmap Heap Scan on item_mods  (cost=1084.27..17622.45 rows=57774 width=12) (actual time=15.416..3197.257 rows=57547 loops=1)"
"                                Recheck Cond: (mod_id = 3)"
"                                ->  Bitmap Index Scan on primary_value_mod_id_desc  (cost=0.00..1069.83 rows=57774 width=0) (actual time=13.517..13.517 rows=57547 loops=1)"
"                                      Index Cond: (mod_id = 3)"
"                    ->  Hash  (cost=36420.95..36420.95 rows=543004 width=12) (actual time=2987.089..2987.089 rows=543024 loops=1)"
"                          Buckets: 4096  Batches: 32  Memory Usage: 811kB"
"                          ->  Seq Scan on items  (cost=0.00..36420.95 rows=543004 width=12) (actual time=0.012..2825.650 rows=543024 loops=1)"
"                                Filter: (player_id = 1)"
"Total runtime: 6457.586 ms"

Update 2

OK, I think I'm almost there. This query takes 6 seconds and produces what I want:

SELECT "items".id, item_mods.primary_value
FROM "items" 
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36 
WHERE (items.player_id = '1') 
ORDER BY item_mods.primary_value DESC, item_mods.id DESC
LIMIT 100

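But this query takes 9 ms! Note the difference in the ORDER BY. However, I need the rows sorted by most recent as well. I have an index on (item_mods.primary_value DESC, item_mods.id DESC), but it doesn't seem to be used?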

SELECT "items".id, item_mods.primary_value
FROM "items" 
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36 
WHERE (items.player_id = '1') 
ORDER BY item_mods.primary_value DESC
LIMIT 100

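3 Answers:

Answer 0 (score: 1)

I assume you are using the Postgres "feature" that lets you group by a table's primary/unique key and then select all columns from that table. Otherwise, select * doesn't make sense in an aggregation query.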

SELECT "items".*
FROM "items"  INNER JOIN
     item_mods
     ON item_mods.item_id = items.id INNER JOIN
     mods
     ON mods.id = item_mods.mod_id AND item_mods.mod_id = 3 
WHERE (items.player_id = '1') 
GROUP BY items.id, item_mods.primary_value 
ORDER BY item_mods.primary_value DESC NULLS LAST, items.created_at DESC
LIMIT 100;

The following indexes should help this query:

items(player_id, id);
item_mods(item_id, mod_id);
mods(id);
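
As a rough sketch, the shorthand above could be written as the following DDL. The index names are hypothetical and chosen only for illustration. The plans in the question already show a primary key index mods_pkey on mods(id), so a separate index on that column is probably redundant.

```sql
-- Hypothetical index names; the columns are the ones listed in the answer.
CREATE INDEX items_player_id_id_idx ON items (player_id, id);
CREATE INDEX item_mods_item_id_mod_id_idx ON item_mods (item_id, mod_id);

-- mods(id) is already served by the primary key index mods_pkey
-- that appears in the EXPLAIN output, so no extra index is created here.
```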

Answer 1 (score: 0)

I fixed it by adding an index on (mod_id, primary_value DESC, id DESC) to the item_mods table. The query now runs in 10-15 ms.
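
A minimal sketch of that fix, assuming the query being served is the one from Update 2. The index name is hypothetical. The likely reason this helps: with an equality condition on the leading column mod_id, the index entries for that mod are already stored in primary_value DESC, id DESC order, so the planner can read rows in the requested order and stop once the LIMIT is satisfied instead of sorting tens of thousands of rows.

```sql
-- Hypothetical index name; column order and sort directions follow the answer.
CREATE INDEX item_mods_mod_id_primary_value_id_idx
    ON item_mods (mod_id, primary_value DESC, id DESC);

-- The query from Update 2: equality on mod_id, then an ORDER BY that
-- matches the remaining index columns and their directions.
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC, item_mods.id DESC
LIMIT 100;
```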

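Answer 2 (score: 0)

Use a composite index.

What is an index?

An index is a special block of information about the data in a table. It has to be updated on every update of the table that holds it, which means that if you constantly update the data of an indexed table, the indexes will have a negative impact on performance.

On the positive side, an index reduces search/sort/group time.

What is a composite index? A composite index is a special block of information that can be viewed as a sorted array whose rows contain the concatenation of the values of all the columns that make up the key. A composite key contains only columns of a single table (in MySQL; I'm not sure about other databases!), and it can speed up many kinds of queries made against that single table.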
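
To make the "sorted array" picture concrete, here is a small sketch using the item_mods table from the question. The index name and the sample filter values are assumptions for illustration only. Because the entries are ordered by the first column and then by the second, the index tends to help most when the query constrains the leading column.

```sql
-- Hypothetical composite index over two columns of a single table.
CREATE INDEX item_mods_mod_id_primary_value_idx
    ON item_mods (mod_id, primary_value);

-- Well suited to the index: the filter is on the leading column, and the
-- entries for that mod_id are already ordered by primary_value.
SELECT item_id, primary_value
FROM item_mods
WHERE mod_id = 3
ORDER BY primary_value
LIMIT 100;

-- Usually benefits much less: there is no condition on the leading column
-- mod_id, so the matching entries are scattered throughout the index.
SELECT item_id, primary_value
FROM item_mods
WHERE primary_value > 50;
```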

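What are the potential candidates (columns) for an index? Those used for searching (selection), grouping, and sorting (ordering).

Is there a way to force/ignore index usage? Yes. (In MySQL!)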
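
For reference, a sketch of what the MySQL index hints look like; idx_mod_value is a hypothetical index name. Note that this syntax is specific to MySQL. PostgreSQL, the database used in the question, has no such hints, although planner settings such as enable_seqscan can be switched off in a session while experimenting.

```sql
-- MySQL only: ask the optimizer to use a given index (hypothetical name).
SELECT item_id, primary_value
FROM item_mods FORCE INDEX (idx_mod_value)
WHERE mod_id = 3;

-- MySQL only: ask the optimizer to ignore a given index.
SELECT item_id, primary_value
FROM item_mods IGNORE INDEX (idx_mod_value)
WHERE mod_id = 3;

-- PostgreSQL: no index hints; for testing, sequential scans can be
-- discouraged for the current session instead.
SET enable_seqscan = off;
```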

What are the potential candidates for an index, and how do you find them? Columns from queries whose performance is considered slow.
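
In PostgreSQL, one way to find those columns is to run the slow query under EXPLAIN ANALYZE, as the question does. The sketch below reuses a simplified form of the question's query. Plan nodes worth looking at are scans that read many rows under a Filter line and sorts that spill to disk ("external sort"), both of which appear in the plans posted in the question.

```sql
-- EXPLAIN ANALYZE executes the query and reports actual time per plan node.
EXPLAIN ANALYZE
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 3
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC NULLS LAST
LIMIT 100;
```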