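How to optimize a MySQL query with 2 different inner joins? (InnoDB)

Time: 2015-05-11 09:43:00

Tags: mysql sql performance

I have a query that runs against tables using the InnoDB storage engine.

I want to optimize it because it takes too long to execute. My database holds 5 million rows, and the query currently takes 250 seconds to run.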

INSERT INTO dynamicgroups (adressid) 

    SELECT SQL_NO_CACHE DISTINCT(addressid) FROM (
        SELECT cluster_0.addressid FROM (
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                               ON f_0.addressid = g.addressid
                        WHERE  (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH))
                            AND f_0.birthday < CURDATE() )
                ) group_all
            ) AS groups

        ) AS cluster_0

        INNER JOIN(
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                               ON f_0.addressid = g.addressid
                        WHERE  (marriage_date = ''
                             OR marriage_date = '1900-01-01'
                             OR marriage_date = '0000-00-00' )
                ) group_all
            ) AS groups
        ) AS cluster_1 ON cluster_1.addressid = cluster_0.addressid

        INNER JOIN(
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                                ON f_0.addressid = g.addressid
                        WHERE  (f_0.city = '34' )
                ) group_all
            ) AS groups
        ) AS cluster_2 ON cluster_2.addressid = cluster_1.addressid 
    ) AS t

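3 answers:

Answer 0: (score: 1)

Even though EXPLAIN is not implemented the same way as in other database systems, I recommend running it on your query.

You can then analyze the output EXPLAIN gives you and decide which columns should be indexed.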
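As a rough sketch of that workflow, EXPLAIN can be run on the join that all three subqueries share. The index name and column choice below are assumptions for illustration, not something taken from the question; the right index depends on what EXPLAIN reports and on the indexes that already exist.

```sql
-- Show the execution plan for the join shared by all three subqueries,
-- here with the city filter from the third one.
EXPLAIN
SELECT g.addressid
FROM table2.`635_emadresmgroups` g
INNER JOIN table2.emaildata f_0
        ON f_0.addressid = g.addressid
WHERE f_0.city = '34';

-- If the plan shows a full scan of emaildata (type = ALL, key = NULL),
-- an index along these lines may help.
-- idx_city_addressid is a hypothetical name.
ALTER TABLE table2.emaildata
    ADD INDEX idx_city_addressid (city, addressid);
```

In the EXPLAIN output, the type, key and rows columns are usually the first ones worth checking.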

For more details, I suggest looking at these sources:

MySQL syntax: EXPLAIN

MySQL using: EXPLAIN

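Also, the last 2 subselects look very similar. Perhaps you could build a temporary table or a view from them so that you don't have to run the whole select twice?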

Answer 1: (score: 1)

Your subqueries all appear to be variations of this query:

SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
     table2.emaildata f_0
     ON f_0.addressid = g.addressid
WHERE  (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() )

I suggest solving this with GROUP BY and HAVING instead:

SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
     table2.emaildata f_0
     ON f_0.addressid = g.addressid
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
       SUM(marriage_date = '' OR marriage_date = '1900-01-01'  OR marriage_date = '0000-00-00' ) > 0 AND
       SUM(f_0.city = '34' ) > 0;

Depending on the volume of data, filtering before the GROUP BY may also help:

SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
     table2.emaildata f_0
     ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) OR
      (marriage_date = ''  OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) OR
      (f_0.city = '34' )
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
       SUM(marriage_date = '' OR marriage_date = '1900-01-01'  OR marriage_date = '0000-00-00' ) > 0 AND
       SUM(f_0.city = '34' ) > 0;

Answer 2: (score: 0)

marriage_date: make it NULLable and use NULL instead of '', and so on. That avoids the inefficient OR and may make an INDEX usable.
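A minimal sketch of that change, assuming marriage_date is a DATE column in table2.emaildata (the question does not show the schema, so the column definition here is a guess and should be matched against SHOW CREATE TABLE):

```sql
-- Assumption: marriage_date is a DATE column in table2.emaildata.
-- MODIFY needs the full column definition, so adjust it to the real one.
ALTER TABLE table2.emaildata
    MODIFY marriage_date DATE NULL;

-- Replace the placeholder dates with NULL.
-- Depending on sql_mode (strict mode, NO_ZERO_DATE), statements that touch
-- '0000-00-00' values may raise errors and need adjusting.
UPDATE table2.emaildata
SET marriage_date = NULL
WHERE marriage_date IN ('1900-01-01', '0000-00-00');

-- The three-way OR in the query then collapses to a single test:
--   WHERE f_0.marriage_date IS NULL
```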

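Please provide SHOW CREATE TABLE so that we can assess the current indexes.

What version are you running? Until very recently, this construct was very inefficient: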

FROM ( SELECT ... )
JOIN ( SELECT ... )

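The workaround is to put the subqueries into tmp tables and add an INDEX.

This may help, since you appear to be using the JOINs for filtering: change JOIN ( SELECT ... ) into WHERE EXISTS ( SELECT * ... ).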
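One way the EXISTS rewrite might look for the query in the question is sketched below. It assumes birthday, marriage_date and city are all columns of table2.emaildata, which the question implies but does not state outright. The alias f is introduced here for illustration.

```sql
-- Each derived-table JOIN becomes an EXISTS test against emaildata.
-- DISTINCT guards against duplicate addressid rows in the groups table.
SELECT DISTINCT g.addressid
FROM table2.`635_emadresmgroups` g
WHERE EXISTS (SELECT *
              FROM table2.emaildata f
              WHERE f.addressid = g.addressid
                AND f.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
                AND f.birthday < CURDATE())
  AND EXISTS (SELECT *
              FROM table2.emaildata f
              WHERE f.addressid = g.addressid
                AND (f.marriage_date = ''
                     OR f.marriage_date = '1900-01-01'
                     OR f.marriage_date = '0000-00-00'))
  AND EXISTS (SELECT *
              FROM table2.emaildata f
              WHERE f.addressid = g.addressid
                AND f.city = '34');
```

Each EXISTS can stop at the first matching row, so an index on emaildata that starts with addressid is likely to matter here.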

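Please describe in English what the query is trying to do.

Another approach, building on the common SELECT that Gordon suggested: put the common SELECT into a TEMPORARY table, add an index, and then query from it.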
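A sketch of that approach follows. The table name tmp_group_members and the index name are made up for illustration, and it again assumes the three filter columns all live in table2.emaildata.

```sql
-- Materialize the common join once, keeping only the columns the filters need.
CREATE TEMPORARY TABLE tmp_group_members AS
SELECT g.addressid, f_0.birthday, f_0.marriage_date, f_0.city
FROM table2.`635_emadresmgroups` g
INNER JOIN table2.emaildata f_0
        ON f_0.addressid = g.addressid;

-- Index the grouping column before querying.
ALTER TABLE tmp_group_members ADD INDEX idx_addressid (addressid);

-- Apply the three conditions in a single pass over the temporary table.
SELECT addressid
FROM tmp_group_members
GROUP BY addressid
HAVING SUM(birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
           AND birthday < CURDATE()) > 0
   AND SUM(marriage_date = ''
           OR marriage_date = '1900-01-01'
           OR marriage_date = '0000-00-00') > 0
   AND SUM(city = '34') > 0;

DROP TEMPORARY TABLE tmp_group_members;
```

The final SELECT reads the temporary table only once on purpose: MySQL restricts referring to a TEMPORARY table more than once in the same query, so joining it to itself three times would not work.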