Taming a monster MySQL query

Date: 2012-10-12 11:27:17

Tags: mysql cakephp

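So, with some help from SO users, I ended up with a logically correct MySQL query for the task I'm working on: retrieving a reverse-chronological list of ids for the news items a user is allowed to see, with certain types of grouped items collapsed to a single representative of the group. (Phew!)

The obvious problem is that this query is extremely unwieldy and slow. According to the database call in CakePHP's debug printout, it clocks in at 145000 ms. Ouch.

Is there a reasonable way to tame a beast like this, or should I admit that I've bitten off more than I can chew here and look for a less unwieldy way to get more or less similar results? All suggestions appreciated.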

    SELECT DISTINCT Uid.id, Uid.type
    FROM (SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date,
                 uids.type type
          FROM uids 
          JOIN uids_uids ON uids_uids.uid_id = uids.id
          JOIN aros_uids ON uids.id = aros_uids.uid_id
          JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
          WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                              'ArtworkResource','Event') 
            AND (uids.start_date IS NULL OR uids.start_date <= NOW())
            AND (uids.end_date IS NULL OR uids.end_date <= NOW())
            AND aros_uids.aro_id IN (3,2,86,1448)
          ) Uid
    JOIN (SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate
          FROM uids JOIN uids_uids
          ON uids_uids.uid_id = uids.id
          GROUP BY uids_uids.parent_id, uids.type) T2
    ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    ORDER BY Uid.date DESC
    LIMIT 100

ETA:

OK, as a first pass I turned those subselects into views, so the query now looks much more manageable:

    SELECT DISTINCT Uid.id, Uid.type
    FROM UidView Uid
    JOIN UidView2 T2
    ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    WHERE Uid.aro_id IN (3,2,86,1448)
    ORDER BY Uid.date DESC
    LIMIT 100

That certainly helped, cutting Cake's estimated query time from six figures down to around 2500 ms. A good start!

1 answer:

Answer 0: (score: 0)

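Here's what I would try:

Take each derived-table query and run EXPLAIN on it separately. As the comments suggest, check for any rows in the output that are missing an index, and add one where needed. Post your EXPLAIN results if you want help with them. So: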

    EXPLAIN SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date, ....
    EXPLAIN SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate ....
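
If EXPLAIN shows a full scan (type `ALL` with a NULL key) on one of the join or filter columns, indexes along these lines might be a starting point. This is only a sketch: the columns are taken from the JOIN and WHERE clauses in the question, but the index names are invented, and some of these indexes may already exist on your tables, so check first.

```sql
-- See which indexes already exist before adding anything.
SHOW INDEX FROM uids_uids;
SHOW INDEX FROM aros_uids;

-- Hypothetical index names; columns come from the joins and filters in the query.
-- Covers the join on uid_id and lets parent_id be read from the index.
ALTER TABLE uids_uids ADD INDEX idx_uid_parent (uid_id, parent_id);

-- Supports the aro_id IN (...) filter followed by the join on uid_id.
ALTER TABLE aros_uids ADD INDEX idx_aro_uid (aro_id, uid_id);
```

Re-run EXPLAIN after each change to confirm the optimizer actually uses the new index before keeping it.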

If adding indexes doesn't help, or doesn't help much, then first put each subquery into a temp table so you can apply indexes to it:

    -- Materialize the first derived table (items filtered by type, dates and ARO).
    CREATE TABLE temp_uid
    SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date,
           uids.type type
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    JOIN aros_uids ON uids.id = aros_uids.uid_id
    JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
    WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                        'ArtworkResource','Event')
      AND (uids.start_date IS NULL OR uids.start_date <= NOW())
      AND (uids.end_date IS NULL OR uids.end_date <= NOW())
      AND aros_uids.aro_id IN (3,2,86,1448);

    -- Materialize the second derived table (latest created date per parent and type).
    CREATE TABLE temp_t2
    SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    GROUP BY uids_uids.parent_id, uids.type;

Then JOIN on those tables:

    SELECT DISTINCT Uid.id, Uid.type
    FROM temp_uid AS Uid
    JOIN temp_t2 AS T2 ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    ORDER BY Uid.date DESC
    LIMIT 100;

As I mentioned, you will probably need to add indexes, most likely on these columns of the temp tables:

    ALTER TABLE temp_uid ADD INDEX parentDateIdx (parent_id, date);
    ALTER TABLE temp_t2 ADD INDEX parentMaxDateIdx (parent_id, maxdate);

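If you need to refresh the temp tables, just truncate them and run INSERT INTO temp_uid ... SELECT and INSERT INTO temp_t2 ... SELECT on them instead of CREATE ... SELECT. A stored procedure works well for this.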
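
A refresh procedure for temp_t2 might look roughly like the sketch below. The procedure name `refresh_temp_t2` is made up for illustration, and `DELIMITER` is a command of the mysql command-line client rather than SQL, so other clients may need a different way to define the procedure body. The SELECT is the same one used to build the table.

```sql
DELIMITER //

-- Hypothetical procedure name; rebuilds the contents of temp_t2 in place.
CREATE PROCEDURE refresh_temp_t2()
BEGIN
    -- Empty the table but keep its structure and indexes.
    TRUNCATE TABLE temp_t2;

    -- Repopulate it with the same query that originally created it.
    INSERT INTO temp_t2 (parent_id, maxdate)
    SELECT uids_uids.parent_id, MAX(uids.created)
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    GROUP BY uids_uids.parent_id, uids.type;
END //

DELIMITER ;

-- Run it whenever the underlying data has changed.
CALL refresh_temp_t2();
```

A matching procedure for temp_uid would follow the same pattern. Note that TRUNCATE causes an implicit commit, so the table is briefly empty for other sessions while the refresh runs.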

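BTW, doing CREATE TABLE temp_t2 ... SELECT, as I did for each temp table, may not produce the best table structure, so you may want to alter the tables afterwards or define them yourself from scratch.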
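
As a rough illustration of defining the table yourself, something like the following could replace the CREATE TABLE temp_t2 ... SELECT step. The column types here are assumptions, since the question doesn't show the schema; match them to the real definitions of uids_uids.parent_id and uids.created.

```sql
-- Assumed column types: adjust to match uids_uids.parent_id and uids.created.
CREATE TABLE temp_t2 (
    parent_id INT UNSIGNED,
    maxdate   DATETIME,
    -- Same composite index as the ALTER TABLE above, declared up front.
    INDEX parentMaxDateIdx (parent_id, maxdate)
);
```

The table is then filled with INSERT INTO temp_t2 ... SELECT, and the separate ALTER TABLE ... ADD INDEX step for temp_t2 is no longer needed.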