How to optimize a complex query?

Asked: 2010-11-18 20:53:19

Tags: sql mysql query-optimization greatest-n-per-group

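I'm working on a marketing-type system. On the front page, one of the requirements is for sales people to see the number of sales opportunities they currently have, e.g.: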

Birthdays     | 10
Anniversaries | 15
Introductions | 450
Recurring     | 249

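The problem is that I'm UNIONing all of these queries, and the combined query takes over 10 seconds in some cases. (We have caching in place, so this is only a problem the first time a user logs in each day.)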

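There are a lot of other criteria involved:

  • The count should include only the latest record per customer per type (e.g., if a customer has two introductions, it should be counted only once; I'm using the greatest-n-per-group method to achieve this)
  • For birthdays and anniversaries, the date should be within +/- 7 days of today
  • For all of them, only records from the past 60 days should be counted
  • These records need to be joined with the customers table to ensure that the opportunity's sales person matches the customer's current sales person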

Here's the generated query (it's long):

SELECT 'Birthdays' AS `type`, COUNT(*) AS `num` 
FROM `opportunities` 
INNER JOIN `customers` 
    ON `opportunities`.`customer_id` = `customers`.`customer_id` 
    AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id` 
LEFT JOIN `opportunities` AS `o2` 
    ON `opportunities`.`customer_id` = `o2`.`customer_id` 
    AND `opportunities`.`marketing_message` = `o2`.`marketing_message` 
    AND opportunities.communication_alert_date < o2.communication_alert_date 
WHERE ((`opportunities`.`org_code` = ?)) 
AND (opportunities.marketing_message = 'Birthday Alert') 
AND ((opportunities.communication_alert_date BETWEEN 
    DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY))) 
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 
AND (o2.customer_id IS NULL) 

UNION ALL 

SELECT 'Anniversaries' AS `type`, COUNT(*) AS `num` 
FROM `opportunities` 
INNER JOIN `customers` 
    ON `opportunities`.`customer_id` = `customers`.`customer_id` 
    AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id` 
LEFT JOIN `opportunities` AS `o2` 
    ON `opportunities`.`customer_id` = `o2`.`customer_id` 
    AND `opportunities`.`marketing_message` = `o2`.`marketing_message` 
    AND opportunities.communication_alert_date < o2.communication_alert_date 
WHERE ((`opportunities`.`org_code` = ?)) 
AND (opportunities.marketing_message = 'Anniversary Alert') 
AND ((opportunities.communication_alert_date BETWEEN 
    DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY))) 
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 
AND (o2.customer_id IS NULL) 

UNION ALL 

SELECT 'Introductions' AS `type`, COUNT(*) AS `num` 
FROM `opportunities` 
INNER JOIN `customers` 
    ON `opportunities`.`customer_id` = `customers`.`customer_id` 
    AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id` 
LEFT JOIN `opportunities` AS `o2` 
    ON `opportunities`.`customer_id` = `o2`.`customer_id` 
    AND `opportunities`.`marketing_message` = `o2`.`marketing_message` 
    AND opportunities.communication_alert_date < o2.communication_alert_date 
WHERE ((`opportunities`.`org_code` = ?)) 
AND ((opportunities.Intro_Letter = 'Yes')) 
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 
AND (o2.customer_id IS NULL) 

UNION ALL 

SELECT 'Recurring' AS `type`, COUNT(*) AS `num` 
FROM `opportunities` 
INNER JOIN `customers` 
    ON `opportunities`.`customer_id` = `customers`.`customer_id` 
    AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id` 
LEFT JOIN `opportunities` AS `o2` 
    ON `opportunities`.`customer_id` = `o2`.`customer_id` 
    AND `opportunities`.`marketing_message` = `o2`.`marketing_message` 
    AND opportunities.communication_alert_date < o2.communication_alert_date 
WHERE ((`opportunities`.`org_code` = ?)) 
AND ((opportunities.marketing_message != 'Anniversary Alert' 
AND opportunities.marketing_message != 'Birthday Alert' 
AND opportunities.Intro_Letter != 'Yes')) 
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 
AND (o2.customer_id IS NULL)

I have indexes on the following fields in the opportunities table:

  • org_code
  • customer_id
  • Intro_Letter
  • marketing_message
  • sales_person_id
  • org_code, marketing_message
  • org_code, Intro_Letter
  • org_code, marketing_message, Intro_Letter

Any help optimizing this would be greatly appreciated. I'm open to creating other tables or views if need be.

6 Answers:

Answer 0 (score: 2)

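A good place to start would be to remove the string comparisons: put the strings in a table with assigned IDs, and add a numeric column to use in place of comparisons such as

opportunities.marketing_message != 'Birthday Alert'

So you would have...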

[id]    [name]
1       Birthday Alert
2       Anniversary

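Numeric comparisons are generally faster than string comparisons, even with indexes. Doing this also lets you easily add new types in the future.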
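A minimal sketch of that change in MySQL might look like the following. The table name `message_types`, the column name `marketing_message_id`, and the constraint name are hypothetical (none of them appear in the question), and the foreign key assumes both tables use InnoDB.

```sql
-- Hypothetical lookup table: one row per alert type.
CREATE TABLE message_types (
    id   TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    name VARCHAR(50)      NOT NULL UNIQUE
) ENGINE=InnoDB;

INSERT INTO message_types (id, name) VALUES
    (1, 'Birthday Alert'),
    (2, 'Anniversary Alert');

-- Hypothetical numeric column that stands in for the string column.
ALTER TABLE opportunities
    ADD COLUMN marketing_message_id TINYINT UNSIGNED NULL,
    ADD CONSTRAINT fk_opportunities_message_type
        FOREIGN KEY (marketing_message_id) REFERENCES message_types (id);

-- Backfill the new column from the existing text values.
UPDATE opportunities o
JOIN message_types t ON t.name = o.marketing_message
SET o.marketing_message_id = t.id;

-- The filter then becomes a numeric comparison, for example:
--   WHERE o.marketing_message_id <> 1
```

Note that a NULL id never satisfies `<>`, so for the Recurring count to keep working the lookup table would need a row for every distinct marketing_message value before the backfill.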

This part is redundant: you don't need AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)), because the BETWEEN clause before it already restricts the date to the last 7 days.

AND ((opportunities.communication_alert_date BETWEEN 
    DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY))) 
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))

Answer 1 (score: 2)

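I agree with the existing comments that the alert text needs to be in a type table with a foreign key relationship to the OPPORTUNITIES table.

Leave it to Zend to use two queries when you only need one: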

   SELECT CASE
            WHEN marketing_message = 'Birthday Alert' THEN 'Birthdays'
            WHEN marketing_message = 'Anniversary Alert' THEN 'Anniversaries'
          END AS msg,
          COUNT(*)
     FROM OPPORTUNITIES o
     JOIN CUSTOMERS c ON c.customer_id = o.customer_id
                 AND c.sales_person_id = o.sales_person_id
LEFT JOIN OPPORTUNITIES o2 ON o2.customer_id = o.customer_id
                      AND o2.marketing_message = o.marketing_message
                      AND o2.communication_alert_date > o.communication_alert_date -- o2 is a later row; o is kept only when none exists
    WHERE o.org_code = ?
      AND o.marketing_message IN ('Birthday Alert', 'Anniversary Alert') 
      AND o.communication_alert_date BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) 
                                         AND DATE_ADD(NOW(), INTERVAL 7 DAY)
      AND o.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)
      AND o2.customer_id IS NULL
 GROUP BY msg

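Answer 2 (score: 0)

It would be easier to read if you removed all the grouping parentheses in the WHERE clause. That would at least make it easier to see what's going on and to optimize.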
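As an illustration, here is how the Birthdays branch of the question's query might read with the redundant parentheses dropped. The short aliases `o` and `c` are introduced here for readability; the predicates are meant to be unchanged from the question.

```sql
SELECT 'Birthdays' AS `type`, COUNT(*) AS `num`
FROM opportunities AS o
INNER JOIN customers AS c
    ON o.customer_id = c.customer_id
    AND o.sales_person_id = c.sales_person_id
-- Anti-join: o2 matches any later row for the same customer and message.
LEFT JOIN opportunities AS o2
    ON o.customer_id = o2.customer_id
    AND o.marketing_message = o2.marketing_message
    AND o.communication_alert_date < o2.communication_alert_date
WHERE o.org_code = ?
  AND o.marketing_message = 'Birthday Alert'
  AND o.communication_alert_date BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY)
                                     AND DATE_ADD(NOW(), INTERVAL 7 DAY)
  AND o.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)
  AND o2.customer_id IS NULL  -- keep o only when no later row exists
```

The same applies to the other three branches, since every condition in their WHERE clauses is combined with AND.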

Answer 3 (score: 0)

In each subquery:

LEFT JOIN `opportunities` AS `o2` 
    ON `opportunities`.`customer_id` = `o2`.`customer_id` 
...
AND (o2.customer_id IS NULL)

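This means you only want the opportunities o2 that have NULL for customer_id. Because of this, the queries can be written with 2 INNER joins instead of 1 OUTER and 1 INNER join, which is probably faster. Like this: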

SELECT 'Birthdays' AS `type`, COUNT(*) AS `num` 
FROM `opportunities` as `o2`
INNER JOIN `opportunities` AS `o1` 
    ON `o1`.`marketing_message` = `o2`.`marketing_message` 
    AND o1.communication_alert_date < o2.communication_alert_date 
INNER JOIN `customers` 
    ON `o1`.`customer_id` = `customers`.`customer_id` 
    AND `o1`.`sales_person_id` = `customers`.`sales_person_id` 
WHERE (o2.customer_id IS NULL)
AND (o2.marketing_message = 'Birthday Alert') 
AND ((`o1`.`org_code` = ?)) 
AND ((o1.communication_alert_date BETWEEN 
    DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY))) 
AND (o1.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 

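Answer 4 (score: 0)

In addition to the answers provided, I replaced the LEFT JOIN with a subquery that returns only the latest instance per type. This seemed to help a great deal.

I.e. (for the Birthday and Anniversary counts only):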

SELECT 
    CASE
        WHEN marketing_message = 'Birthday Alert' THEN 'Birthdays'
        WHEN marketing_message = 'Anniversary Alert' THEN 'Anniversaries'
    END AS `type`, 
    COUNT(*) AS `num` 
FROM (
    SELECT `opp_sub`.* 
    FROM (
        SELECT `opportunities`.`marketing_message`, `opportunities`.`customer_id`
        FROM `opportunities`
        INNER JOIN `customers` 
            ON `opportunities`.`customer_id` = `customers`.`customer_id` 
            AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id` 
        WHERE (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)) 
        AND (`opportunities`.`dealer_code` = ?)
        AND (opportunities.marketing_message IN ('Anniversary Alert', 'Birthday Alert')) 
        AND (opportunities.communication_alert_date 
            BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) 
                AND DATE_ADD(NOW(), INTERVAL 7 DAY))
        ORDER BY `opportunities`.`communication_alert_date` DESC
    ) AS `opp_sub` 
    GROUP BY `customer_id`, `marketing_message`
) AS `c_table` 
GROUP BY `type`

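Answer 5 (score: 0)

If you take a look at http://dev.mysql.com/doc/refman/5.0/en/using-explain.html you'll see how to examine your query with the EXPLAIN keyword, which gives you information about how the query will be executed. Then you can see exactly where the performance is poor.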
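For example, prefixing one branch of the question's query with EXPLAIN might look like the sketch below. The literal 'ABC' is a placeholder standing in for the bound `?` parameter; substitute a real org_code.

```sql
EXPLAIN
SELECT 'Birthdays' AS `type`, COUNT(*) AS `num`
FROM opportunities
INNER JOIN customers
    ON opportunities.customer_id = customers.customer_id
    AND opportunities.sales_person_id = customers.sales_person_id
LEFT JOIN opportunities AS o2
    ON opportunities.customer_id = o2.customer_id
    AND opportunities.marketing_message = o2.marketing_message
    AND opportunities.communication_alert_date < o2.communication_alert_date
WHERE opportunities.org_code = 'ABC'  -- placeholder value, not from the question
  AND opportunities.marketing_message = 'Birthday Alert'
  AND opportunities.communication_alert_date BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY)
                                                 AND DATE_ADD(NOW(), INTERVAL 7 DAY)
  AND o2.customer_id IS NULL;
```

In the output, the `key` column shows which index (if any) MySQL chose for each table, `rows` is its estimate of how many rows it must examine, and a `type` of ALL indicates a full table scan. The row for `o2` is worth checking first, since the self-join runs once per candidate opportunity.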