How can I optimize this query?

Asked: 2011-04-27 17:43:35

Tags: mysql sql query-optimization

Here is the query:

SELECT p.name p_name,
c.name c_name,
p.line1,
p.zip,
c.line1,
c.zip
FROM

(SELECT c.name,
ad.line1,
ad.zip
FROM customer c
JOIN account a ON a.customer_id = c.id
JOIN account_address aa ON aa.account_id = a.id
JOIN address ad ON aa.address_id = ad.id
JOIN account_import ai ON a.account_import_id = ai.id
JOIN generic_import gi ON ai.generic_import_id = gi.id
JOIN import_bundle ib ON gi.import_bundle_id = ib.id
WHERE gi.active = 1
AND ib.active = 1
AND ib.bank_id = 8
LIMIT 1000) c

JOIN
(SELECT p.name,
a.line1,
a.zip
FROM prospect p
JOIN address a ON p.address_id = a.id) p
ON
0
OR (p.zip = c.zip AND SUBSTRING(p.name, 1, 12) = SUBSTRING(c.name, 1, 12))
OR (p.zip = c.zip AND p.name = c.name)
OR (p.zip = c.zip
  AND SUBSTRING(p.name, 1, 4) = SUBSTRING(c.name, 1, 4)
  AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 4) = SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 4))
OR (p.zip = c.zip
  AND SUBSTRING(p.name, 1, 3) = SUBSTRING(c.name, 1, 3)
  AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 3) = SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 3)
  AND SUBSTRING(p.line1, 1, 4) = SUBSTRING(c.line1, 1, 4))

And here is the EXPLAIN output:

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 15030
        Extra: Using where; Using join buffer
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: p
         type: ALL
possible_keys: address_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 15067
        Extra:
*************************** 4. row ***************************
           id: 3
  select_type: DERIVED
        table: a
         type: eq_ref
possible_keys: PRIMARY,index_address_id
          key: PRIMARY
      key_len: 8
          ref: mcif.p.address_id
         rows: 1
        Extra:
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: ib
         type: index_merge
possible_keys: PRIMARY,bank_id,fk_bank_id,index_import_bundle_id,index_import_bundle_bank_id,index_import_bundle_active
          key: index_import_bundle_active,fk_bank_id
      key_len: 1,8
          ref: NULL
         rows: 1
        Extra: Using intersect(index_import_bundle_active,fk_bank_id); Using where; Using index
*************************** 6. row ***************************
           id: 2
  select_type: DERIVED
        table: gi
         type: ref
possible_keys: PRIMARY,import_bundle_id,index_generic_import_id,index_generic_import_import_bundle_id,index_generic_import_active
          key: import_bundle_id
      key_len: 8
          ref: mcif.ib.id
         rows: 34
        Extra: Using where
*************************** 7. row ***************************
           id: 2
  select_type: DERIVED
        table: ai
         type: ref
possible_keys: PRIMARY,generic_import_id,index_account_import_generic_import_id
          key: generic_import_id
      key_len: 8
          ref: mcif.gi.id
         rows: 1
        Extra: Using index
*************************** 8. row ***************************
           id: 2
  select_type: DERIVED
        table: a
         type: ref
possible_keys: PRIMARY,fk_account_customer_id,index_account_customer_id,index_account_id,index_account_account_import_id
          key: index_account_account_import_id
      key_len: 9
          ref: mcif.ai.id
         rows: 1482
        Extra: Using where
*************************** 9. row ***************************
           id: 2
  select_type: DERIVED
        table: c
         type: eq_ref
possible_keys: PRIMARY,index_customer_id
          key: PRIMARY
      key_len: 8
          ref: mcif.a.customer_id
         rows: 1
        Extra:
*************************** 10. row ***************************
           id: 2
  select_type: DERIVED
        table: aa
         type: ref
possible_keys: fk_account_address_account_id,fk_account_address_address_id,index_account_address_account_id,index_account_address_address_id
          key: fk_account_address_account_id
      key_len: 8
          ref: mcif.a.id
         rows: 1
        Extra:
*************************** 11. row ***************************
           id: 2
  select_type: DERIVED
        table: ad
         type: eq_ref
possible_keys: PRIMARY,index_address_id
          key: PRIMARY
      key_len: 8
          ref: mcif.aa.address_id
         rows: 1
        Extra:
11 rows in set (0.10 sec)

I don't know where to start. I think what I mainly need is for someone to interpret the EXPLAIN output.

4 answers:

Answer 0 (score: 1)

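Those SUBSTRING calls rule out any use of an index on the name columns. Can you drop them and join on the full name instead? If not, could you add extra indexed columns that hold the name substrings to the tables, and then join on those columns?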
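
One way to sketch the second suggestion, assuming the name fragments used in the join are stored in dedicated columns. The column names (name_first4, name_last4), their types, and the index names are hypothetical; none of them come from the question.

```sql
-- Hypothetical helper columns holding the fragments the join compares.
ALTER TABLE prospect
  ADD COLUMN name_first4 VARCHAR(4),
  ADD COLUMN name_last4  VARCHAR(4);

ALTER TABLE customer
  ADD COLUMN name_first4 VARCHAR(4),
  ADD COLUMN name_last4  VARCHAR(4);

-- Populate once. New and changed rows must be kept in sync afterwards,
-- for example from application code or triggers.
UPDATE prospect
SET name_first4 = SUBSTRING(name, 1, 4),
    name_last4  = SUBSTRING(SUBSTRING_INDEX(name, ' ', -1), 1, 4);

UPDATE customer
SET name_first4 = SUBSTRING(name, 1, 4),
    name_last4  = SUBSTRING(SUBSTRING_INDEX(name, ' ', -1), 1, 4);

-- Index the stored fragments so an equality join on them can use an index.
CREATE INDEX idx_prospect_name_parts ON prospect (name_first4, name_last4);
CREATE INDEX idx_customer_name_parts ON customer (name_first4, name_last4);
```

The matching branch of the join could then read p.name_first4 = c.name_first4 AND p.name_last4 = c.name_last4 instead of calling SUBSTRING on both sides. These indexes can only help if the join runs against the base tables: the EXPLAIN in the question shows both derived tables being scanned with type ALL and no key.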

Answer 1 (score: 0)

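The database engine tries to use the WHERE clause to eliminate rows as early as possible. In your first subselect, however, it cannot start eliminating rows with the WHERE clause until six tables have been joined. The first table referenced in your WHERE clause should always be the first table you select from, and the second table referenced in your WHERE clause should be the first table you join.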

So try changing the order of the tables in the big inner select :)

Here is my attempt. It is untested, of course, since I don't have the database:

SELECT
  p.name AS p_name,
  c.name AS c_name,
  p.line1,
  p.zip,
  c.line1,
  c.zip
FROM (
  SELECT
    c.name,
    ad.line1,
    ad.zip
  FROM generic_import AS gi
  JOIN import_bundle AS ib ON gi.import_bundle_id = ib.id
  JOIN account_import AS ai ON ai.generic_import_id = gi.id
  JOIN account AS a ON a.account_import_id = ai.id
  JOIN account_address AS aa ON aa.account_id = a.id
  JOIN address AS ad ON aa.address_id = ad.id
  JOIN customer AS c ON a.customer_id = c.id
  WHERE gi.active = 1
    AND ib.active = 1
    AND ib.bank_id = 8
  LIMIT 1000) AS c
  JOIN 
    (SELECT 
      p.name,
      a.line1,
      a.zip
    FROM prospect AS p
    JOIN address AS a ON p.address_id = a.id
    ) AS p
    ON (p.zip = c.zip
    AND SUBSTRING(p.name, 1, 12) = SUBSTRING(c.name, 1, 12)
       )
    OR (p.zip = c.zip AND p.name = c.name)
    OR 
    (p.zip = c.zip AND SUBSTRING(p.name, 1, 4) = SUBSTRING(c.name, 1, 4)
      AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 4) = 
        SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 4)
    )
    OR 
    (p.zip = c.zip AND SUBSTRING(p.name, 1, 3) = SUBSTRING(c.name, 1, 3)
    AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 3) = 
      SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 3)
    AND SUBSTRING(p.line1, 1, 4) = SUBSTRING(c.line1, 1, 4)
    )

Is it any faster? :D
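
For inner joins, MySQL's optimizer normally picks the join order itself, which is why the EXPLAIN in the question already starts from ib (row 5) even though customer is written first. To test whether a particular order actually helps, the order can be pinned with STRAIGHT_JOIN. The following is a sketch for the first subselect only, using the tables from the question; whether it beats the optimizer's own choice is something to measure rather than assume.

```sql
-- STRAIGHT_JOIN makes MySQL join the tables in the order they are written,
-- so the two filtered tables (ib, gi) are read before the rest.
SELECT STRAIGHT_JOIN
  c.name,
  ad.line1,
  ad.zip
FROM import_bundle AS ib
JOIN generic_import AS gi ON gi.import_bundle_id = ib.id
JOIN account_import AS ai ON ai.generic_import_id = gi.id
JOIN account AS a ON a.account_import_id = ai.id
JOIN customer AS c ON c.id = a.customer_id
JOIN account_address AS aa ON aa.account_id = a.id
JOIN address AS ad ON ad.id = aa.address_id
WHERE ib.bank_id = 8
  AND ib.active = 1
  AND gi.active = 1
LIMIT 1000;
```

Prefixing this statement with EXPLAIN and comparing the result with rows 5 to 11 of the plan in the question shows whether the forced order differs from what the optimizer picked on its own.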

Answer 2 (score: 0)

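You could try adding prospect.name to the index on prospect.address_id, so that you end up with an index on (address_id, name). That should improve performance slightly (at least for row 3 of the EXPLAIN output).

Also, the line OR (p.zip = c.zip AND p.name = c.name) looks redundant: (p.zip = c.zip AND SUBSTRING(p.name, 1, 12) = SUBSTRING(c.name, 1, 12)) already matches every row where p.name = c.name.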
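
A minimal sketch of the index change, assuming name is a VARCHAR short enough to be indexed in full (otherwise a prefix length would be needed). The index name is hypothetical.

```sql
-- Composite index: address_id stays the leftmost column, so lookups by
-- address_id can still use it, and name can be read from the index itself.
ALTER TABLE prospect
  ADD INDEX idx_prospect_address_id_name (address_id, name);

-- Re-check the plan for the prospect side of the query afterwards.
EXPLAIN
SELECT p.name, a.line1, a.zip
FROM prospect AS p
JOIN address AS a ON p.address_id = a.id;
```

If the optimizer decides to use the new index as a covering index, the row for p may change from type ALL to type index with "Using index" in the Extra column. The existing single-column index on address_id then becomes redundant and could be dropped once nothing else depends on it.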

Answer 3 (score: 0)

The solution is to get rid of the subqueries and rewrite the query using only joins.
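
A joins-only version might look like the sketch below. It makes several assumptions: the LIMIT 1000 in the original inner select has no direct equivalent without a subquery and is dropped, so the result set can be larger; the aliases ca and pa (customer address, prospect address) and the output column aliases are made up here; and the zip join only benefits from an index if address.zip is indexed, which the question does not show. The constant 0 OR prefix and the full-name branch are omitted because they do not change which rows match.

```sql
SELECT p.name   AS p_name,
       c.name   AS c_name,
       pa.line1 AS p_line1,
       pa.zip   AS p_zip,
       ca.line1 AS c_line1,
       ca.zip   AS c_zip
FROM import_bundle AS ib
JOIN generic_import AS gi  ON gi.import_bundle_id = ib.id
JOIN account_import AS ai  ON ai.generic_import_id = gi.id
JOIN account AS a          ON a.account_import_id = ai.id
JOIN customer AS c         ON c.id = a.customer_id
JOIN account_address AS aa ON aa.account_id = a.id
JOIN address AS ca         ON ca.id = aa.address_id
-- Every branch of the original ON clause required equal zips,
-- so that test is factored out into a plain equality join.
JOIN address AS pa         ON pa.zip = ca.zip
JOIN prospect AS p         ON p.address_id = pa.id
WHERE ib.bank_id = 8
  AND ib.active = 1
  AND gi.active = 1
  AND (
        SUBSTRING(p.name, 1, 12) = SUBSTRING(c.name, 1, 12)
     OR (    SUBSTRING(p.name, 1, 4) = SUBSTRING(c.name, 1, 4)
         AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 4)
           = SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 4))
     OR (    SUBSTRING(p.name, 1, 3) = SUBSTRING(c.name, 1, 3)
         AND SUBSTRING(SUBSTRING_INDEX(p.name, ' ', -1), 1, 3)
           = SUBSTRING(SUBSTRING_INDEX(c.name, ' ', -1), 1, 3)
         AND SUBSTRING(pa.line1, 1, 4) = SUBSTRING(ca.line1, 1, 4))
      );
```

With the derived tables gone, the optimizer can consider indexes on the base tables for the zip comparison, instead of scanning two materialized results as in rows 1 and 2 of the EXPLAIN in the question.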