Question

When I run SHOW PROCESSLIST; on my database, I see the query below. It uses a lot of CPU (over 100%) and takes 80 seconds to complete. We have a dedicated database server (64GB RAM).

INSERT INTO `search_tmp_598075de5c7e67_73335919` 
SELECT `main_select`.`entity_id`, MAX(score) AS `relevance` 
  FROM (SELECT `search_index`.`entity_id`, (((0)) * 1) AS score  
        FROM `catalogsearch_fulltext_scope1` AS `search_index`
        LEFT JOIN `catalog_eav_attribute` AS `cea`
                    ON search_index.attribute_id = cea.attribute_id
        LEFT JOIN `catalog_category_product_index` AS `category_ids_index` 
                    ON search_index.entity_id = category_ids_index.product_id
        LEFT JOIN `review_entity_summary` AS `rating`
                    ON `rating`.`entity_pk_value`=`search_index`.entity_id
                   AND `rating`.entity_type = 1
                   AND `rating`.store_id  =  1
       WHERE (category_ids_index.category_id = 2299)
  ) AS `main_select`
 GROUP BY `entity_id`
 ORDER BY `relevance` DESC
 LIMIT 10000

Why does this query use all of my CPU resources?

Answer 1

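Some inefficiencies:

There is a non-null condition on the records of the outer-joined catalog_category_product_index (the WHERE clause requires category_id = 2299). This turns the outer join into an inner join. It is more efficient to use the inner join clause directly.
The nested query is not necessary: the grouping, sorting and limiting can be done directly on the inner query.
(((0)) * 1) is just a complicated way of saying 0, and taking the MAX of that will obviously still return a relevance of 0 for all records. Not only is this an inefficient way to output 0, it also makes no sense. I assume your real query has some less obvious calculation there, which might need optimisation.
If catalog_eav_attribute.attribute_id is a unique field, then there is no point in outer joining that table, since its data is not used anywhere.
If review_entity_summary.entity_pk_value is unique (at least when entity_type = 1 and store_id = 1), then again there is no point in outer joining that table, since its data is not used anywhere.
If the fields in the two points above are not unique, but the number of records returned per search_index.entity_id value does not influence the outcome (and with the obscure (((0)) * 1) value as it currently stands, it does not), then the outer joins are not needed either.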
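The first point above can be checked on toy data. The following is a minimal sketch, assuming an in-memory SQLite database stands in for MySQL and using hypothetical two-column tables that only borrow the aliases from the question. It shows that a LEFT JOIN whose right-hand table is filtered in the WHERE clause returns the same rows as an INNER JOIN. It says nothing about performance.

```python
import sqlite3

# In-memory SQLite is an assumption used only for illustration; the
# question itself runs on MySQL. Table layouts are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE search_index (entity_id INTEGER);
    CREATE TABLE category_ids_index (product_id INTEGER, category_id INTEGER);
    INSERT INTO search_index VALUES (1), (2), (3);
    -- product 2 is in another category; product 3 has no category row
    INSERT INTO category_ids_index VALUES (1, 2299), (2, 10);
""")

# Outer join followed by a WHERE condition on the right-hand table:
# the NULL-extended row for product 3 fails the condition and is dropped.
left_join = conn.execute("""
    SELECT s.entity_id
      FROM search_index AS s
      LEFT JOIN category_ids_index AS c ON s.entity_id = c.product_id
     WHERE c.category_id = 2299
     ORDER BY s.entity_id
""").fetchall()

# The same query written as an inner join.
inner_join = conn.execute("""
    SELECT s.entity_id
      FROM search_index AS s
     INNER JOIN category_ids_index AS c ON s.entity_id = c.product_id
     WHERE c.category_id = 2299
     ORDER BY s.entity_id
""").fetchall()

print(left_join, inner_join)
```

Both queries keep only product 1, because a row with NULL in category_id can never satisfy category_id = 2299.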

Under these assumptions, the select part can be simplified to:

SELECT search_index.entity_id, MAX(((0)) * 1) AS relevance
  FROM catalogsearch_fulltext_scope1 AS search_index
 INNER JOIN catalog_category_product_index AS category_ids_index
         ON search_index.entity_id = category_ids_index.product_id
 WHERE category_ids_index.category_id = 2299
 GROUP BY search_index.entity_id
 ORDER BY relevance DESC
 LIMIT 10000

I still left (((0)) * 1) in there, but it really makes no sense.
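As a sanity check on the simplification, here is a rough sketch that runs the original SELECT and the simplified one against the same toy data and compares the results. It assumes SQLite as a stand-in for MySQL, and the table definitions are hypothetical minimal versions containing only the columns the query touches. Since every relevance is 0, the ORDER BY leaves ties in an unspecified order, so the rows are compared after sorting.

```python
import sqlite3

# SQLite and the minimal schemas below are assumptions for illustration;
# the real Magento tables have more columns and indexes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalogsearch_fulltext_scope1 (entity_id INTEGER, attribute_id INTEGER);
    CREATE TABLE catalog_eav_attribute (attribute_id INTEGER);
    CREATE TABLE catalog_category_product_index (category_id INTEGER, product_id INTEGER);
    CREATE TABLE review_entity_summary (entity_pk_value INTEGER, entity_type INTEGER, store_id INTEGER);
    INSERT INTO catalogsearch_fulltext_scope1 VALUES (1, 10), (1, 11), (2, 10), (3, 12);
    INSERT INTO catalog_eav_attribute VALUES (10), (11), (12);
    INSERT INTO catalog_category_product_index VALUES (2299, 1), (2299, 3), (5, 2), (5, 1);
    -- duplicate rating rows multiply the joined rows but cannot change MAX(0)
    INSERT INTO review_entity_summary VALUES (1, 1, 1), (1, 1, 1), (3, 1, 2);
""")

original_sql = """
SELECT main_select.entity_id, MAX(score) AS relevance
  FROM (SELECT search_index.entity_id, (((0)) * 1) AS score
          FROM catalogsearch_fulltext_scope1 AS search_index
          LEFT JOIN catalog_eav_attribute AS cea
                 ON search_index.attribute_id = cea.attribute_id
          LEFT JOIN catalog_category_product_index AS category_ids_index
                 ON search_index.entity_id = category_ids_index.product_id
          LEFT JOIN review_entity_summary AS rating
                 ON rating.entity_pk_value = search_index.entity_id
                AND rating.entity_type = 1
                AND rating.store_id = 1
         WHERE category_ids_index.category_id = 2299
       ) AS main_select
 GROUP BY entity_id
 ORDER BY relevance DESC
 LIMIT 10000
"""

simplified_sql = """
SELECT search_index.entity_id, MAX(((0)) * 1) AS relevance
  FROM catalogsearch_fulltext_scope1 AS search_index
 INNER JOIN catalog_category_product_index AS category_ids_index
         ON search_index.entity_id = category_ids_index.product_id
 WHERE category_ids_index.category_id = 2299
 GROUP BY search_index.entity_id
 ORDER BY relevance DESC
 LIMIT 10000
"""

# Sort before comparing: all relevances tie at 0, so row order is arbitrary.
original = sorted(conn.execute(original_sql).fetchall())
simplified = sorted(conn.execute(simplified_sql).fetchall())
print(original == simplified, simplified)
```

On this data both queries return products 1 and 3 with relevance 0. Matching on one toy dataset does not prove equivalence in general; it only illustrates why the extra outer joins cannot affect a result built from MAX over a constant.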
