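I have a query that runs against tables using the InnoDB storage engine.
I want to optimize it because it takes too long to execute. My database holds 5 million rows, and the query currently takes 250 seconds to run.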
INSERT INTO dynamicgroups (adressid)
SELECT SQL_NO_CACHE DISTINCT(addressid) FROM (
    SELECT cluster_0.addressid FROM (
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH))
                    AND f_0.birthday < CURDATE() )
            ) group_all
        ) AS groups
    ) AS cluster_0
    INNER JOIN(
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (marriage_date = ''
                    OR marriage_date = '1900-01-01'
                    OR marriage_date = '0000-00-00' )
            ) group_all
        ) AS groups
    ) AS cluster_1 ON cluster_1.addressid = cluster_0.addressid
    INNER JOIN(
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (f_0.city = '34' )
            ) group_all
        ) AS groups
    ) AS cluster_2 ON cluster_2.addressid = cluster_1.addressid
) AS t
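Answer 0 (score: 1)
Even if EXPLAIN is not as polished as some other tools, I recommend running it on your query.
You can then analyze the output EXPLAIN gives you and decide which columns should be indexed.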
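Also, the last two SELECTs look very similar. Perhaps you could build a temporary table or a view from them, so you don't have to run the whole SELECT twice?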
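As a rough sketch of the EXPLAIN suggestion above: prefix one of the inner SELECTs with EXPLAIN and look at the type, key, and rows columns for each table. The index names and column choices below are hypothetical, since the question does not show the current schema; treat them as candidates to test, not as a known fix.

```sql
-- Inspect the plan of one branch of the original query.
EXPLAIN
SELECT g.addressid
FROM table2.635_emadresmgroups g
INNER JOIN table2.emaildata f_0
    ON f_0.addressid = g.addressid
WHERE f_0.city = '34';

-- Hypothetical candidate indexes (names are made up for illustration).
-- Only worth adding if EXPLAIN shows a full scan (type = ALL) on these tables
-- and SHOW CREATE TABLE confirms that no equivalent index already exists.
CREATE INDEX idx_emaildata_city_addressid ON table2.emaildata (city, addressid);
CREATE INDEX idx_groups_addressid ON table2.635_emadresmgroups (addressid);
```

After adding an index, rerun EXPLAIN and check whether the key column now names it.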
Answer 1 (score: 1)
Your subqueries all appear to be variations on this query:
SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
table2.emaildata f_0
ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() )
I suggest using GROUP BY and HAVING instead:
SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
table2.emaildata f_0
ON f_0.addressid = g.addressid
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
SUM(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) > 0 AND
SUM(f_0.city = '34' ) > 0;
Depending on the volume of data, filtering before the GROUP BY can also help:
SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
table2.emaildata f_0
ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) OR
(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) OR
(f_0.city = '34' )
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
SUM(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) > 0 AND
SUM(f_0.city = '34' ) > 0;
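If addressid happens to be unique in emaildata (an assumption; the question does not show the schema), all three conditions test the same row and no grouping is needed. A minimal sketch under that assumption:

```sql
-- Assumes at most one emaildata row per addressid (NOT confirmed by the question).
-- If an address can have several emaildata rows, keep the GROUP BY / HAVING form,
-- because the three conditions may then be satisfied by different rows.
INSERT INTO dynamicgroups (adressid)   -- column name spelled as in the question
SELECT DISTINCT g.addressid            -- DISTINCT in case g repeats an addressid
FROM table2.635_emadresmgroups g
INNER JOIN table2.emaildata f_0
    ON f_0.addressid = g.addressid
WHERE f_0.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
  AND f_0.birthday < CURDATE()
  AND (marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00')
  AND f_0.city = '34';
```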
Answer 2 (score: 0)
marriage_date: make it NULLable and use NULL instead of '' and the other placeholder values. That avoids the inefficient OR and may make an INDEX usable.
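A sketch of what that change could look like, assuming marriage_date is a DATE column in table2.emaildata (the question does not confirm either the type or the table). Depending on sql_mode, statements involving '0000-00-00' may be rejected and need adjusting, so try this on a copy first.

```sql
-- Assumption: marriage_date is a DATE column on table2.emaildata.
ALTER TABLE table2.emaildata
    MODIFY marriage_date DATE NULL;

-- Replace the placeholder values with NULL.
UPDATE table2.emaildata
SET marriage_date = NULL
WHERE marriage_date = '1900-01-01'
   OR marriage_date = '0000-00-00';

-- The three-way OR in the query then becomes a single test:
SELECT g.addressid
FROM table2.635_emadresmgroups g
INNER JOIN table2.emaildata f_0
    ON f_0.addressid = g.addressid
WHERE f_0.marriage_date IS NULL;
```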
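Please provide SHOW CREATE TABLE so that we can evaluate the current indexes.
What version are you running? Until very recently, this construct was very inefficient: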
FROM ( SELECT ... )
JOIN ( SELECT ... )
A workaround is to put the subqueries into tmp tables and add an INDEX.
This may help, since you appear to be using JOINs for filtering: turn JOIN ( SELECT ... ) into WHERE EXISTS ( SELECT * ... ).
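For illustration, the EXISTS form might look roughly like this. It is a sketch, not a tested rewrite: the alias f is made up, and marriage_date is assumed to live in emaildata.

```sql
-- Each former JOIN ( SELECT ... ) becomes a correlated EXISTS test.
INSERT INTO dynamicgroups (adressid)   -- column name spelled as in the question
SELECT DISTINCT g.addressid
FROM table2.635_emadresmgroups g
WHERE EXISTS ( SELECT * FROM table2.emaildata f
               WHERE f.addressid = g.addressid
                 AND f.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
                 AND f.birthday < CURDATE() )
  AND EXISTS ( SELECT * FROM table2.emaildata f
               WHERE f.addressid = g.addressid
                 AND ( f.marriage_date = ''
                    OR f.marriage_date = '1900-01-01'
                    OR f.marriage_date = '0000-00-00' ) )
  AND EXISTS ( SELECT * FROM table2.emaildata f
               WHERE f.addressid = g.addressid
                 AND f.city = '34' );
```

An index on emaildata that starts with addressid is what would let each EXISTS probe stay cheap.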
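Please describe in plain English what the query is trying to do.
Another approach, building on the common SELECT that Gordon suggested: put the common SELECT into a TEMPORARY table, add an index, then query from it.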
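A sketch of that TEMPORARY table approach. The names tmp_common and idx_addressid are made up, and the unqualified marriage_date is assumed to resolve to a single column. Note that MySQL does not let you refer to a TEMPORARY table more than once in the same query, so the final step uses the single-pass GROUP BY / HAVING form rather than a self-join.

```sql
-- 1. Materialize the common join once (tmp_common is a hypothetical name).
CREATE TEMPORARY TABLE tmp_common AS
SELECT g.addressid, f_0.birthday, marriage_date, f_0.city
FROM table2.635_emadresmgroups g
INNER JOIN table2.emaildata f_0
    ON f_0.addressid = g.addressid;

-- 2. Index it on the grouping column.
ALTER TABLE tmp_common ADD INDEX idx_addressid (addressid);

-- 3. Query it with a single reference to the temporary table.
INSERT INTO dynamicgroups (adressid)   -- column name spelled as in the question
SELECT addressid
FROM tmp_common
GROUP BY addressid
HAVING SUM(birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND birthday < CURDATE()) > 0
   AND SUM(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00') > 0
   AND SUM(city = '34') > 0;

DROP TEMPORARY TABLE tmp_common;
```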