Query 1:
select distinct email from mybigtable where account_id=345
Takes 0.1 seconds.
Query 2:
Select count(*) as total from mybigtable where account_id=123 and email IN (<include all from above result>)
Takes 0.2 seconds.
Query 3:
Select count(*) as total from mybigtable where account_id=123 and email IN (select distinct email from mybigtable where account_id=345)
Takes 22 minutes, 90% of it in the "preparing" state. Why does this take so long?
The table is InnoDB with 3.2 million rows, on MySQL 5.0.
Answer 0: (score: 0)
There is more going on here:
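I suspect your query cache was already warm for queries 1 and 2, which gives misleading timings. Run RESET QUERY CACHE; before timing them again (FLUSH QUERY CACHE; only defragments the cache and does not remove cached results).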
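I suspect query 3 goes through a temporary table, most often on disk, whereas query 2 is guaranteed to run from RAM. The default temporary-table settings in my.cnf are very conservative.
Try this to make sure you are not being hit by the old de-optimization bug in MySQL: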
SELECT count(DISTINCT b.primary_key_column) AS total
FROM mybigtable a
INNER JOIN mybigtable b
ON a.email=b.email
WHERE a.account_id=345
AND b.account_id=123
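A minimal sketch of why the join above counts DISTINCT primary keys rather than rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; it says nothing about MySQL timings), a toy table, and a hypothetical primary key column named id in place of primary_key_column.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mybigtable ("
    "id INTEGER PRIMARY KEY, account_id INTEGER, email TEXT)"
)
rows = [
    (345, "a@x.com"),
    (345, "a@x.com"),  # duplicate email under account 345
    (345, "b@x.com"),
    (123, "a@x.com"),
    (123, "b@x.com"),
    (123, "c@x.com"),  # email not present under account 345
]
conn.executemany(
    "INSERT INTO mybigtable (account_id, email) VALUES (?, ?)", rows
)

# Original form: IN with a subquery.
in_count = conn.execute("""
    SELECT COUNT(*) FROM mybigtable
    WHERE account_id = 123
      AND email IN (SELECT DISTINCT email FROM mybigtable WHERE account_id = 345)
""").fetchone()[0]

# Self-join counting distinct primary keys of the account 123 rows.
join_distinct = conn.execute("""
    SELECT COUNT(DISTINCT b.id)
    FROM mybigtable a
    INNER JOIN mybigtable b ON a.email = b.email
    WHERE a.account_id = 345 AND b.account_id = 123
""").fetchone()[0]

# Same join with a plain COUNT(*): duplicate emails under 345 inflate it.
join_plain = conn.execute("""
    SELECT COUNT(*)
    FROM mybigtable a
    INNER JOIN mybigtable b ON a.email = b.email
    WHERE a.account_id = 345 AND b.account_id = 123
""").fetchone()[0]

print(in_count, join_distinct, join_plain)
```

On this toy data the IN form and the DISTINCT join agree, while the plain COUNT(*) join counts one account 123 row twice because its email appears twice under account 345.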
Answer 1: (score: 0)
MySQL handles subqueries inside an IN clause very poorly. I would rewrite it as:
SELECT COUNT(*) as total
FROM mybigtable t
INNER JOIN (
SELECT DISTINCT email
FROM mybigtable
WHERE account_id = 345
) x
ON t.email = x.email
WHERE t.account_id=123
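A quick sanity check that the derived-table join returns the same count as the IN form, sketched on a toy table. SQLite (via Python's built-in sqlite3) stands in for MySQL here, which is an assumption for the sake of a runnable example; it only checks the result, not the speed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mybigtable (account_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO mybigtable VALUES (?, ?)",
    [
        (345, "a@x.com"), (345, "a@x.com"), (345, "b@x.com"),
        (123, "a@x.com"), (123, "a@x.com"), (123, "c@x.com"),
    ],
)

# Query 3 from the question: IN with a subquery.
subquery_sql = """
    SELECT COUNT(*) AS total
    FROM mybigtable
    WHERE account_id = 123
      AND email IN (SELECT DISTINCT email FROM mybigtable WHERE account_id = 345)
"""

# The rewrite: join against a derived table of distinct emails.
derived_sql = """
    SELECT COUNT(*) AS total
    FROM mybigtable t
    INNER JOIN (
        SELECT DISTINCT email
        FROM mybigtable
        WHERE account_id = 345
    ) x ON t.email = x.email
    WHERE t.account_id = 123
"""

subquery_count = conn.execute(subquery_sql).fetchone()[0]
derived_count = conn.execute(derived_sql).fetchone()[0]
print(subquery_count, derived_count)
```

Because the derived table x holds each email only once, the join cannot multiply the account 123 rows, so a plain COUNT(*) is enough.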
Answer 2: (score: 0)
After much thought, and with help from dba.se, this is the final query that works the magic.
select count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
LEFT JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
WHERE tbl345.email IS NULL
) A;
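To see what this query counts, here is a small sketch on toy data, again with SQLite (Python's built-in sqlite3) as an assumed stand-in for MySQL. The LEFT JOIN ... IS NULL pattern keeps the account 123 rows whose email has no match under account 345, so on this data it returns the rows that the IN form from the question leaves out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mybigtable (account_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO mybigtable VALUES (?, ?)",
    [
        (345, "a@x.com"), (345, "b@x.com"),
        (123, "a@x.com"), (123, "c@x.com"), (123, "d@x.com"),
    ],
)

# The LEFT JOIN ... IS NULL query from this answer.
anti_count = conn.execute("""
    select count(*) EmailCount from
    (
        select tbl123.email from
        (select email from mybigtable where account_id=123) tbl123
        LEFT JOIN
        (select distinct email from mybigtable where account_id=345) tbl345
        using (email)
        WHERE tbl345.email IS NULL
    ) A
""").fetchone()[0]

# The IN form from the question, for comparison.
in_count = conn.execute("""
    SELECT COUNT(*) FROM mybigtable
    WHERE account_id = 123
      AND email IN (SELECT DISTINCT email FROM mybigtable WHERE account_id = 345)
""").fetchone()[0]

# All rows for account 123.
total_123 = conn.execute(
    "SELECT COUNT(*) FROM mybigtable WHERE account_id = 123"
).fetchone()[0]

print(anti_count, in_count, total_123)
```

On this data the two counts add up to the number of account 123 rows, which is a handy check when deciding which of the two numbers you actually need.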