Optimizing a join on a derived table: EXPLAIN differs between local machine and server

Time: 2019-02-27 00:10:36

Tags: mysql join optimization subquery database-administration

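I have the following ugly query. On my local machine (running v5.7) it runs, though not quickly (1.4 seconds). On the server I use, which runs an older version of MySQL (v5.5), the query just hangs. It seems to get stuck in "Copying to tmp table":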

SELECT
  SQL_CALC_FOUND_ROWS
  DISTINCT p.parcel_number,
  p.street_number,
  p.street_name,
  p.site_address_city_state,
  p.number_of_units,
  p.number_of_stories,
  p.bedrooms,
  p.bathrooms,
  p.lot_area_sqft,
  p.cost_per_sq_ft,
  p.year_built,
  p.sales_date,
  p.sales_price,
  p.id
  FROM (
    SELECT APN, property_case_detail_id FROM property_inspection AS pi
      GROUP BY APN, property_case_detail_id
      HAVING 
      COUNT(IF(status='Resolved Date', 1, NULL)) = 0
    ) as open_cases
  JOIN property AS p
  ON p.parcel_number = open_cases.APN
  LIMIT 0, 1000;

mysql> show processlist;
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| Id    | User        | Host      | db           | Command | Time | State                | Info                                                                                                 |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| 21120 | headsupcity | localhost | lead_housing | Query   |   21 | Copying to tmp table | SELECT
          SQL_CALC_FOUND_ROWS
          DISTINCT p.parcel_number,
          p.street_numbe |
| 21121 | headsupcity | localhost | lead_housing | Query   |    0 | NULL                 | show processlist                                                                                     |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

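The EXPLAIN output differs between my local machine and the server, and I assume the only reason my query runs at all on my local machine is the key that is automatically created on the derived table: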

EXPLAIN (local):

+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key         | key_len | ref                          | rows    | filtered | Extra                           |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
|  1 | PRIMARY     | p          | NULL       | ALL  | NULL          | NULL        | NULL    | NULL                         |   40319 |   100.00 | Using temporary                 |
|  1 | PRIMARY     | <derived2> | NULL       | ref  | <auto_key0>   | <auto_key0> | 8       | lead_housing.p.parcel_number |      40 |   100.00 | NULL                            |
|  2 | DERIVED     | pi         | NULL       | ALL  | NULL          | NULL        | NULL    | NULL                         | 1623978 |   100.00 | Using temporary; Using filesort |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+

EXPLAIN (server):

+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows    | Extra                                    |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
|  1 | PRIMARY     | p          | ALL  | NULL          | NULL | NULL    | NULL |   41369 | Using temporary                          |
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |  122948 | Using where; Distinct; Using join buffer |
|  2 | DERIVED     | pi         | ALL  | NULL          | NULL | NULL    | NULL | 1718586 | Using temporary; Using filesort          |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+

Schema:

mysql> explain property_inspection;
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field                   | Type         | Null | Key | Default           | Extra                       |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| id                      | int(11)      | NO   | PRI | NULL              | auto_increment              |
| lblCaseNo               | int(11)      | NO   | MUL | NULL              |                             |
| APN                     | bigint(10)   | NO   | MUL | NULL              |                             |
| date                    | varchar(50)  | NO   |     | NULL              |                             |
| status                  | varchar(500) | NO   |     | NULL              |                             |
| property_case_detail_id | int(11)      | YES  | MUL | NULL              |                             |
| case_type_id            | int(11)      | YES  | MUL | NULL              |                             |
| date_modified           | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| update_status           | tinyint(1)   | YES  |     | 1                 |                             |
| created_date            | datetime     | NO   |     | NULL              |                             |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
10 rows in set (0.02 sec)

mysql> explain property; (not all columns, but you get the gist)
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field                      | Type         | Null | Key | Default           | Extra                       |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| id                         | int(11)      | NO   | PRI | NULL              | auto_increment              |
| parcel_number              | bigint(10)   | NO   |     | 0                 |                             |
| date_modified              | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| created_date               | datetime     | NO   |     | NULL              |                             |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+

Possibly relevant variables:

tmp_table_size: 16777216
innodb_buffer_pool_size: 8589934592

Any ideas on how to optimize this, and on why the EXPLAIN outputs are so different?

2 Answers:

Answer 0 (score: 1)

MySQL 5.5 and 5.7 are quite different, and the latter has a better optimizer, so it is no surprise that the execution plans differ.
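
One concrete difference visible in the two plans from the question is `<auto_key0>`: MySQL 5.6 and later can add an index to a materialized derived table, while 5.5 cannot. A possible workaround on 5.5, offered only as a sketch, is to materialize the subquery yourself into a temporary table with an explicit key. The names `open_cases_tmp` and `idx_apn` are made up for illustration, and the column list is shortened.

```sql
-- Materialize the "open cases" subquery once, with an index on the join column.
-- open_cases_tmp and idx_apn are hypothetical names.
CREATE TEMPORARY TABLE open_cases_tmp (KEY idx_apn (APN))
SELECT APN, property_case_detail_id
FROM property_inspection
GROUP BY APN, property_case_detail_id
HAVING COUNT(IF(status = 'Resolved Date', 1, NULL)) = 0;

-- Join against the indexed temporary table instead of the derived table.
SELECT DISTINCT
    p.parcel_number,
    p.id              -- add the other p.* columns from the question here
FROM property AS p
JOIN open_cases_tmp AS oc ON oc.APN = p.parcel_number
LIMIT 0, 1000;

DROP TEMPORARY TABLE open_cases_tmp;
```

This does not remove the cost of grouping 1.6 million rows. It only gives the join an index to use, which is roughly what the 5.7 plan gets automatically.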

You should provide the output of SHOW CREATE TABLE property; and SHOW CREATE TABLE property_inspection;, since it will show the indexes on the tables.

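Your subquery is the problem.
- The server tries to process 1.6 million rows without an index and group all of them.
- HAVING is applied only after every group has been built, so it cannot reduce that work. It is better avoided, especially in a subquery.
- Grouping is a bad idea in this case. You don't need to aggregate or count; you only need to check whether a row with the 'Resolved Date' status exists.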
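
As a rough illustration of "check for existence instead of counting", the derived table from the question could be written with NOT EXISTS. This is a sketch, not a tested replacement. The aliases `pi1` and `pi2` are arbitrary, and the NULL-safe `<=>` is used because `property_case_detail_id` is nullable and GROUP BY puts all NULLs in one group.

```sql
-- (APN, case) pairs that have no 'Resolved Date' row.
SELECT DISTINCT
    pi1.APN,
    pi1.property_case_detail_id
FROM property_inspection AS pi1
WHERE NOT EXISTS (
    SELECT 1
    FROM property_inspection AS pi2
    WHERE pi2.APN = pi1.APN
      AND pi2.property_case_detail_id <=> pi1.property_case_detail_id  -- NULL-safe equality
      AND pi2.status = 'Resolved Date'
);
```

Without an index that starts with `APN` on `property_inspection`, the inner lookup would still scan a lot of rows, so this form only pays off together with suitable indexing.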

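Based on the information provided, I would suggest:
- Alter the table property_inspection to reduce the length of the status column.
- Add an index on the columns. If possible, use a covering index on (APN, property_case_detail_id, status), in this column order.
- Change the query to the following: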

SELECT
    SQL_CALC_FOUND_ROWS
    DISTINCT p.parcel_number,
    ...
    p.id
FROM
    property_inspection AS `pi1`
    INNER JOIN property AS p ON (
        p.parcel_number = `pi1`.APN
    )
    LEFT JOIN (
        SELECT
              `pi2`.property_case_detail_id
            , `pi2`.APN
        FROM
            property_inspection AS `pi2`
        WHERE
            `status` = 'Resolved Date'
    ) AS exclude ON (
        exclude.APN = `pi1`.APN
        AND exclude.property_case_detail_id = `pi1`.property_case_detail_id
    )
WHERE
    exclude.APN IS NULL
LIMIT
    0, 1000;
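
The schema changes suggested above might look roughly like the following. This is a sketch under assumptions: the new length of 50 for `status` and the index names are invented for illustration, and the index on `property.parcel_number` is an extra guess based on the empty Key column shown in the question.

```sql
-- Check how long the status values really are before shrinking the column.
SELECT MAX(CHAR_LENGTH(status)) AS longest_status
FROM property_inspection;

-- Assumption: no status value is longer than 50 characters.
ALTER TABLE property_inspection
    MODIFY status VARCHAR(50) NOT NULL,
    ADD INDEX idx_apn_case_status (APN, property_case_detail_id, status);

-- Assumption: parcel_number is not indexed yet, so the join from
-- property_inspection to property has nothing to look rows up with.
ALTER TABLE property
    ADD INDEX idx_parcel_number (parcel_number);
```

Shrinking `status` first matters on 5.5 because InnoDB there limits an indexed column to 767 bytes by default, which a VARCHAR(500) can exceed depending on the character set.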

Answer 1 (score: 1)

Since this is where the optimizers differ, let's try to optimize:

FROM (
    SELECT APN, property_case_detail_id FROM property_inspection AS pi
      GROUP BY APN, property_case_detail_id
      HAVING
      COUNT(IF(status='Resolved Date', 1, NULL)) = 0
    ) as open_cases

Try this:

SELECT ...
    FROM property AS p
    WHERE NOT EXISTS ( SELECT 1 FROM property_inspection
                 WHERE status = 'Resolved Date'
                   AND p.parcel_number = APN )
    ORDER BY ???  -- without this, the `LIMIT` is unpredictable
    LIMIT 0, 1000;

Or...

SELECT ...
    FROM property AS p
    LEFT JOIN  property_inspection AS pi
           ON  p.parcel_number = pi.APN
          AND  pi.status = 'Resolved Date'  -- must be in ON; in WHERE it would discard the unmatched rows
    WHERE pi.APN IS NULL
    ORDER BY ???  -- without this, the `LIMIT` is unpredictable
    LIMIT 0, 1000;

Indexes:

property_inspection:  INDEX(status, APN) -- in either order
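
A possible way to create that index and check that it gets used, written against the `APN` column that `property_inspection` actually has. This is only a sketch: the index name `idx_status_apn` and the 50-character prefix on `status` are assumptions, the prefix being there to keep the VARCHAR(500) column within InnoDB's key-length limit on 5.5.

```sql
-- Assumption: the first 50 characters are enough to tell status values apart.
ALTER TABLE property_inspection
    ADD INDEX idx_status_apn (status(50), APN);

-- Inspect the plan for the NOT EXISTS form; the subquery row for
-- property_inspection should list idx_status_apn under "key" if it is picked.
EXPLAIN
SELECT p.id
FROM property AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM property_inspection AS pi
    WHERE pi.status = 'Resolved Date'
      AND pi.APN = p.parcel_number
);
```

If the column is shortened first, the prefix can be dropped and the index declared on the full column.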