Question

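I have two queries that each run quite fast on their own (under 2 seconds). However, when I try to join them as subqueries, the combined query runs extremely slowly. The last time I ran it, it took about 68 seconds. Here is the full query: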

SELECT t.count,
       t.total
  FROM (SELECT t.account_number,
               COUNT(t.id) count,
               SUM(t.amount) total,
               ib.id import_bundle_id
          FROM import_bundle ib
          JOIN generic_import gi ON gi.import_bundle_id = ib.id
          JOIN transaction_import ti ON ti.generic_import_id = gi.id
          JOIN account_transaction t ON t.transaction_import_id = ti.id
          JOIN transaction_code tc ON t.transaction_code_id = tc.id
         WHERE tc.code IN (0, 20, 40)
      GROUP BY t.account_number) t
  JOIN (SELECT a.account_number,
               np.code
          FROM import_bundle ib
          JOIN generic_import gi ON gi.import_bundle_id = ib.id
          JOIN account_import ai ON ai.generic_import_id = gi.id
          JOIN account a ON a.account_import_id = ai.id
          JOIN account_northway_product anp ON anp.account_id = a.id
          JOIN northway_product np ON anp.northway_product_id = np.id
         WHERE np.code != 'O1') a ON t.account_number = a.account_number

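It is no surprise that this query runs slowly. If these were two separate tables instead of subqueries, I would put indexes on their account_number columns. However, it is apparently not possible to put an index on the result of a query, so I can't do that. I suspect that is part of the problem.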

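Aside from that, I don't understand why the query is slow, and I have no ideas for speeding it up other than adding two summary tables, which I don't want to do unless I have to.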

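By the way, in English this query might read: "get all POS transactions (codes 0, 20, and 40) for all accounts that aren't overdraft protection accounts (code O1)."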

Answer 1

Put the check into the main query:

SELECT  t.account_number,
        COUNT(t.id) count,
        SUM(t.amount) total,
        ib.id import_bundle_id
FROM    transaction_code tc
JOIN    account_transaction t
ON      t.transaction_code_id = tc.id
JOIN    transaction_import ti
ON      ti.id = t.transaction_import_id
JOIN    generic_import gi
ON      gi.id = ti.generic_import_id
JOIN    import_bundle ib
ON      ib.id = gi.import_bundle_id
WHERE   tc.code IN (0, 20, 40)
        AND t.account_number NOT IN
        (
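        -- return account numbers so the list is comparable with t.account_number
        SELECT  a.account_number
        FROM    account a
        JOIN    account_northway_product anp
        ON      anp.account_id = a.id
        JOIN    northway_product np
        ON      np.id = anp.northway_product_id
        WHERE   np.code = '01'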
        )
GROUP BY
        t.account_number

Create the following indexes:

transaction_code (code)
account_transaction (transaction_code_id)
account_transaction (account_number)
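
As a sketch, the index list above could be written as MySQL statements like the following. The idx_* index names are illustrative only and do not come from the original schema; it is also worth checking first whether equivalent indexes already exist.

```sql
-- Index names are hypothetical; adjust to the schema's naming convention.
CREATE INDEX idx_transaction_code_code
    ON transaction_code (code);

CREATE INDEX idx_account_transaction_code_id
    ON account_transaction (transaction_code_id);

CREATE INDEX idx_account_transaction_account_number
    ON account_transaction (account_number);
```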

Answer 2

It appears that MySQL may be optimizing the query badly, possibly because of its overall complexity.

You may find that the best option is to run it in stages.

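-- <query1>, <query2>, and <predicate> are placeholders for the two subqueries and the join condition
CREATE TEMPORARY TABLE query1 AS <query1>;
CREATE TEMPORARY TABLE query2 AS <query2>;
SELECT * FROM query1 INNER JOIN query2 ON <predicate>;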
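
For illustration, here is one way the staged approach might look in MySQL for the query in the question. Because each stage lands in a temporary table, the account_number columns can be indexed before the final join, which is what the question said was not possible on a subquery result. The names pos_totals, non_overdraft_accounts, and txn_count are made up for this sketch, and it assumes account_number has an indexable type (not TEXT or BLOB).

```sql
-- Stage 1: POS transaction totals per account (first subquery in the question).
-- ib.id is omitted: it is not grouped and the outer query does not use it.
CREATE TEMPORARY TABLE pos_totals AS
SELECT t.account_number,
       COUNT(t.id)   AS txn_count,
       SUM(t.amount) AS total
  FROM import_bundle ib
  JOIN generic_import gi ON gi.import_bundle_id = ib.id
  JOIN transaction_import ti ON ti.generic_import_id = gi.id
  JOIN account_transaction t ON t.transaction_import_id = ti.id
  JOIN transaction_code tc ON t.transaction_code_id = tc.id
 WHERE tc.code IN (0, 20, 40)
 GROUP BY t.account_number;

-- Stage 2: accounts with a product other than 'O1' (second subquery).
CREATE TEMPORARY TABLE non_overdraft_accounts AS
SELECT a.account_number,
       np.code
  FROM import_bundle ib
  JOIN generic_import gi ON gi.import_bundle_id = ib.id
  JOIN account_import ai ON ai.generic_import_id = gi.id
  JOIN account a ON a.account_import_id = ai.id
  JOIN account_northway_product anp ON anp.account_id = a.id
  JOIN northway_product np ON anp.northway_product_id = np.id
 WHERE np.code != 'O1';

-- Index the join column on both intermediate results.
ALTER TABLE pos_totals ADD INDEX (account_number);
ALTER TABLE non_overdraft_accounts ADD INDEX (account_number);

-- Stage 3: join the two indexed intermediate results.
SELECT p.txn_count,
       p.total
  FROM pos_totals p
  JOIN non_overdraft_accounts n ON n.account_number = p.account_number;
```

Temporary tables are visible only to the current session and are dropped when it ends, so this avoids maintaining permanent summary tables.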

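Another option may be to use hints, though I'm not familiar with them in MySQL. Essentially, find out what plan is generated for each subquery on its own, then force the individual joins to use NESTED LOOP JOIN, MERGE JOIN, and so on. This restricts what the optimizer can do, and therefore what it can get 'wrong', but it also limits its ability to respond as the data statistics change.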
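
As a rough sketch of what this could look like in MySQL specifically: EXPLAIN shows the plan chosen for a statement, and the STRAIGHT_JOIN modifier makes the optimizer join tables in the order they are written. This uses the second subquery from the question as the example. Whether the written join order is actually the best one is an assumption that should be checked against the EXPLAIN output.

```sql
-- Inspect the plan the optimizer picks for the second subquery on its own.
EXPLAIN
SELECT a.account_number,
       np.code
  FROM import_bundle ib
  JOIN generic_import gi ON gi.import_bundle_id = ib.id
  JOIN account_import ai ON ai.generic_import_id = gi.id
  JOIN account a ON a.account_import_id = ai.id
  JOIN account_northway_product anp ON anp.account_id = a.id
  JOIN northway_product np ON anp.northway_product_id = np.id
 WHERE np.code != 'O1';

-- Pin the join order to the order written in the FROM clause.
SELECT STRAIGHT_JOIN
       a.account_number,
       np.code
  FROM import_bundle ib
  JOIN generic_import gi ON gi.import_bundle_id = ib.id
  JOIN account_import ai ON ai.generic_import_id = gi.id
  JOIN account a ON a.account_import_id = ai.id
  JOIN account_northway_product anp ON anp.account_id = a.id
  JOIN northway_product np ON anp.northway_product_id = np.id
 WHERE np.code != 'O1';
```

Comparing the EXPLAIN output of each subquery with that of the combined query should show where the plans diverge.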

Answer 3

Based on what I can see from your query:

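1) It looks like your PK is the account number on the import_bundle table. You need a clustered index on that column.
2) Adding an index on northway_product (code), and on the other (code) column used for filtering, would also help.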

Answer 4

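Since most of the tables are already joined, why join them all over again? Just add the other tables that carry the needed IDs. Also, since all of these are JOINs and none is a LEFT JOIN, only records found across all of the joins are returned (which matches your original query, since it ultimately inner-joined the two result sets as well).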

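In addition, adding AND np.code != '01' excludes those entries immediately, leaving only the ones you want in your query. So this inner "PreQuery" (aliased PQ) does all the work for you. However, your GROUP BY does not include the import bundle ID, so it may give you a wrong answer; adjust as needed. The outer query then just pulls the two columns, count and total. It returns multiple rows with no context about which account each count or total belongs to, but you can adjust that as needed too.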

SELECT PQ.count,
       PQ.total
  FROM (SELECT STRAIGHT_JOIN
            t.account_number,
            COUNT(t.id) count,
            SUM(t.amount) total,
            ib.id import_bundle_Id
         FROM 
            transaction_code tc
               join account_transaction t
                  on tc.id = t.transaction_code_id
               join transaction_import ti
                  on t.transaction_import_id = ti.id
               join generic_import gi
                  on ti.generic_import_id = gi.id
               join import_bundle ib
                  on gi.import_bundle_id = ib.id
               join account_import ai 
                  on gi.id = ai.generic_import_id
               join account a
                  on ai.id = a.account_import_id
               join account_northway_product anp
                  on a.id = anp.account_id
               join northway_product np
                  on anp.northway_product_id = np.id
                  AND np.code != '01'
         where
            tc.code in ( 0, 20, 40 )
         group by 
            t.account_number ) PQ

Two queries fast separately, slow when joined as subqueries
