More efficient GROUP BY for a query with CASE

Time: 2012-11-29 20:32:58

Tags: mysql performance group-by

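I have the following query, which builds a recordset used as a report in a pie chart.

It doesn't run often, but when it does it takes a few seconds, and I'm wondering whether there is any way to make it more efficient.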

SELECT
  CASE
    WHEN (lastStatus IS NULL)     THEN 'Unused'
    WHEN (attempts > 3 AND callbackAfter IS NULL)   THEN 'Max Attempts Reached'
    WHEN (callbackAfter IS NOT NULL AND callbackAfter >  DATE_ADD(NOW(), INTERVAL 7 DAY)) THEN 'Call Back After 7 Days'
    WHEN (callbackAfter IS NOT NULL AND callbackAfter <= DATE_ADD(NOW(), INTERVAL 7 DAY)) THEN 'Call Back Within 7 Days'
    WHEN (archived = 0)     THEN 'Call Back Within 7 Days'
    ELSE 'Spoke To'
  END AS statusSummary,
  COUNT(leadId) AS total
FROM
  CO_Lead
WHERE
  groupId = 123
  AND
  deleted = 0
GROUP BY
  statusSummary
ORDER BY
  total DESC;

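I have an index on (groupId, deleted), but I'm not sure whether adding any of the other fields to the index would help (and if it would, how do I decide which field should go first? callbackAfter, because it is used the most?)

The table has about 500,000 rows (but will have 10 times that a year from now).

The only other thing I could think of was to split it into 6 queries (with each WHEN clause moved into the WHERE), but that takes 3 times as long.

Edit:

Here is the table definition:

[table definition missing]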

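2 answers:

Answer 0: (score: 1)

Try removing the index to see whether that improves performance.

An index does not necessarily improve performance. MySQL will usually use an index when one matches the WHERE clause. In this case, that means it reads the index and then has to fetch the data from each page. Those page reads are random rather than sequential. For a query that has to read most of the pages anyway, the random reads can reduce performance.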
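Rather than dropping the index outright, the same experiment can be tried with an index hint. This is a minimal sketch, assuming the existing (groupId, deleted) index is named `idx_group_deleted`; that name is hypothetical, and the real one is listed by `SHOW INDEX FROM CO_Lead`.

```sql
-- idx_group_deleted is a hypothetical name for the (groupId, deleted) index.
-- IGNORE INDEX tells the optimizer not to consider that index for this query only.
SELECT COUNT(*)
FROM CO_Lead IGNORE INDEX (idx_group_deleted)
WHERE groupId = 123
  AND deleted = 0;
```

In the original query the hint would go in the same place, directly after the table name in the FROM clause. Timing the query with and without the hint shows whether the index is helping, and the index itself stays in place for other queries.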

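Answer 1: (score: 1)

Notes:

  1. If leadId cannot be NULL, change COUNT(leadId) to COUNT(*). They are logically equivalent, but the optimizer in most versions of MySQL is not clever enough to recognize that.
  2. Remove the two redundant callbackAfter IS NOT NULL conditions. If callbackAfter satisfies the second part of the condition, it cannot be NULL anyway.
  3. You could split the query into 6 parts and add an appropriate index for each part. However, a CASE expression returns only the first branch that matches, so if the conditions overlap, a row that the CASE counts once would be counted by several of the separate queries, and the results may or may not be correct.
  4. A possible rewrite (note the different output format, and check whether it returns the same results, because it may not!)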

    SELECT
        cnt1 AS "Unused"
      , cnt2 AS "Max Attempts Reached"
      , cnt3 AS "Call Back After 7 Days"
      , cnt4 AS "Call Back Within 7 Days"
      , cnt5 AS "Call Back Within 7 Days"
      , cnt6 - (cnt1+cnt2+cnt3+cnt4+cnt5) AS "Spoke To"
    FROM
      ( SELECT
          ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
              AND lastStatus IS NULL
          ) AS cnt1
        , ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
              AND attempts > 3 AND callbackAfter IS NULL
          ) AS cnt2
        , ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
              AND callbackAfter >  DATE_ADD(NOW(), INTERVAL 7 DAY)
          ) AS cnt3
        , ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
              AND callbackAfter <= DATE_ADD(NOW(), INTERVAL 7 DAY)
          ) AS cnt4
        , ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
              AND archived = 0
          ) AS cnt5
        , ( SELECT COUNT(*)  FROM CO_Lead
            WHERE groupId = 123 AND deleted = 0
          ) AS cnt6
      ) AS tmp ;
    

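    If it does return correct results, you can add an index for each subquery to use:

    For subquery 1: (groupId, deleted, lastStatus)

    For subqueries 2, 3, 4: (groupId, deleted, callbackAfter, attempts)

    For subquery 5: (groupId, deleted, archived)


    Another approach is to keep your query (applying only notes 1 and 2 above) and add a wide covering index: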

     (groupId, deleted, lastStatus, callbackAfter, attempts, archived)
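The last suggestion could look roughly like the sketch below. The index name `idx_lead_status_cover` is hypothetical, and switching to `COUNT(*)` assumes that leadId is declared NOT NULL, as note 1 requires.

```sql
-- Hypothetical index name; the column order is the one suggested in the answer.
CREATE INDEX idx_lead_status_cover
  ON CO_Lead (groupId, deleted, lastStatus, callbackAfter, attempts, archived);

-- The original query with notes 1 and 2 applied.
SELECT
  CASE
    WHEN lastStatus IS NULL                              THEN 'Unused'
    WHEN attempts > 3 AND callbackAfter IS NULL          THEN 'Max Attempts Reached'
    WHEN callbackAfter >  DATE_ADD(NOW(), INTERVAL 7 DAY) THEN 'Call Back After 7 Days'
    WHEN callbackAfter <= DATE_ADD(NOW(), INTERVAL 7 DAY) THEN 'Call Back Within 7 Days'
    WHEN archived = 0                                    THEN 'Call Back Within 7 Days'
    ELSE 'Spoke To'
  END AS statusSummary,
  COUNT(*) AS total          -- note 1: valid only if leadId cannot be NULL
FROM CO_Lead
WHERE groupId = 123
  AND deleted = 0
GROUP BY statusSummary
ORDER BY total DESC;
```

Every column the query reads is in the index, so MySQL can answer it from the index alone without touching the table rows. Running the SELECT under `EXPLAIN` should show `Using index` in the `Extra` column if the covering index is actually being used.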