Poor MySQL performance with OR and IS NULL

Date: 2016-10-02 14:09:34

Tags: mysql sql performance

I am quite surprised by some strange MySQL performance behavior. The following query takes about 3 hours to run:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            (u.id = f.ips_user_id OR u.ips_user_id_holder = f.ips_user_id) AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

I also tried the following, with the same result:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            IFNULL(u.ips_user_id_holder, u.id) = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

The logic is: if the "ips_user_id_holder" column is not null, I should use it; otherwise I should use the "id" column.

If I split the query into two queries, each one takes 15 seconds to run:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            u.ips_user_id_holder = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            u.id = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

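This is not the first time I have run into poor MySQL performance with "OR" or "null checks" in relatively simple queries (Why this mysql query (with is null check) is so slower than this other one?).

The ips_invoice table has about 400,000 records, ips_user_unit_locality about 100,000, and ips_user about 35,000.

I am running MySQL 5.5.49 on an Ubuntu Amazon EC2 instance.

So, what is wrong with the first and second queries? What causes such a big difference?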

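2 Answers:

Answer 0: (score: 1)

There is nothing "wrong" with the first and second queries. However, when you use OR in a join condition (or, equivalently, in a correlated subquery condition), the engine often cannot use an index.

That makes everything slow.

You seem to be aware of at least one workaround, so I won't suggest any others.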
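
One way to check this on your own data is to run EXPLAIN on the subquery by itself. This is only a sketch under assumptions: MySQL 5.5 accepts EXPLAIN for SELECT statements only, so the correlated references f.ips_user_id and f.date are replaced with made-up placeholder literals (123 and '2016-06-01'), and the two index names and definitions are hypothetical, since the question does not say which indexes exist.

```sql
-- Hypothetical indexes (names and columns are assumptions);
-- skip any that the tables already have.
CREATE INDEX idx_user_holder   ON ips_user (ips_user_id_holder);
CREATE INDEX idx_uul_user_date ON ips_user_unit_locality (ips_user_id, `date`);

-- OR version of the subquery. 123 and '2016-06-01' are placeholders
-- standing in for f.ips_user_id and f.date.
EXPLAIN
SELECT uul.ips_locality_id
FROM ips_user_unit_locality AS uul
JOIN ips_user AS u ON u.id = uul.ips_user_id
WHERE (u.id = 123 OR u.ips_user_id_holder = 123)
  AND uul.`date` <= '2016-06-01'
ORDER BY uul.`date` DESC
LIMIT 1;

-- Single-condition version, for comparison.
EXPLAIN
SELECT uul.ips_locality_id
FROM ips_user_unit_locality AS uul
JOIN ips_user AS u ON u.id = uul.ips_user_id
WHERE u.ips_user_id_holder = 123
  AND uul.`date` <= '2016-06-01'
ORDER BY uul.`date` DESC
LIMIT 1;
```

Compare the type, key, and rows columns of the two plans. A full scan (type ALL) or a much larger rows estimate for the OR version would be consistent with the explanation above. The plan for literal values can differ from the plan the optimizer picks for the correlated subquery inside the UPDATE, so treat the output as a hint rather than proof.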

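EDIT:

I will note that your query does not do exactly what you describe in the text. It picks the most recent date across either of the two user IDs. You seem to want to prioritize the IDs. If so, this is more like the query you want: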

UPDATE ips_invoice f
    SET ips_locality_id =
        COALESCE( (SELECT ips_locality_id 
                   FROM ips_user_unit_locality uul JOIN
                        ips_user u
                        ON u.id = uul.ips_user_id 
                   WHERE u.ips_user_id_holder = f.ips_user_id AND
                         uul.date <= f.date 
                   ORDER BY uul.date DESC
                   LIMIT 1
                  ),
                  (SELECT ips_locality_id 
                   FROM ips_user_unit_locality uul
                   WHERE uul.ips_user_id = f.ips_user_id AND
                         uul.date <= f.date 
                   ORDER BY uul.date DESC
                   LIMIT 1
                  )
                )
WHERE f.ips_locality_id IS NULL;

Answer 1: (score: 0)

  1. Use a multi-table UPDATE instead of = ( SELECT ...)

  2. Instead of OR, write two separate UPDATEs
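
A rough sketch of what point 1 could look like for the direct-match case (u.id = f.ips_user_id), untested against the asker's data. The multi-table form of UPDATE does not allow ORDER BY or LIMIT, so the "latest row on or before the invoice date" rule is expressed here as an anti-join instead. That technique is my substitution, not something the answer spells out. The sketch also assumes that ips_user_unit_locality.ips_user_id is never NULL, because the anti-join test relies on it.

```sql
-- Direct-match pass as a multi-table UPDATE (sketch under the assumptions above).
UPDATE ips_invoice AS f
JOIN ips_user_unit_locality AS uul
       ON uul.ips_user_id = f.ips_user_id
      AND uul.`date` <= f.`date`
-- Anti-join: look for a later row for the same user that is still
-- on or before the invoice date.
LEFT JOIN ips_user_unit_locality AS newer
       ON newer.ips_user_id = uul.ips_user_id
      AND newer.`date` <= f.`date`
      AND newer.`date` >  uul.`date`
SET f.ips_locality_id = uul.ips_locality_id
WHERE f.ips_locality_id IS NULL
  AND newer.ips_user_id IS NULL;  -- no later row exists, so uul is the latest
```

The holder case from point 2 would be a second statement of the same shape that reaches ips_user_unit_locality through ips_user on u.ips_user_id_holder = f.ips_user_id. If two locality rows for a user share the same latest date, both survive the anti-join; the invoice is still updated only once, but which of the two localities it receives is not deterministic. Whether this is faster than the correlated subquery depends on the available indexes, so compare the plans before relying on it.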