Given two databases, db1 and db2:
db1.table1
annee code code2 var1 ....
1991 11 12 779
1991 11 14 105
1991 11 15 10
1991 12 11 466
1991 12 14 296
1991 12 15 270
1991 14 11 15
1991 14 12 510
1991 14 15 6
1991 15 11 193
1991 15 12 455
1991 15 14 4
....
1992 11 12 779
1992 11 14 105
1992 11 15 10
1992 12 11 466
1992 12 14 296
1992 12 15 270
1992 14 11 15
1992 14 12 510
1992 14 15 6
1992 15 11 193
1992 15 12 455
1992 15 14 4
....
db2.table2
var1 code ...
test 11
test 12
test 14
test2 11
test2 14
test2 15
...
I need to optimize the following query (db1.table1 contains 8,000,000 rows):
select annee,sum(var1) from db1.table1 as M where
M.code in
(select t1.code from db2.table2 as t1 cross join db2.table2 as t2 where t1.var1='Test2' and t2.var1='Test2' and t1.code <> t2.code)
and M.code2 in
(select t2.code from db2.table2 as t1 cross join db2.table2 as t2 where t1.var1='Test2' and t2.var1='Test2' and t1.code <> t2.code)
group by annee order by annee desc
db1.table1 and db2.table2 are indexed and sorted. Any suggestions would be greatly appreciated. Thanks!
Answer 0: (score: 2)
As a variant, you can try the following:
select m.annee,sum(m.var1)
from db1.table1 m
join
(
select t1.code code1,t2.code code2
from db2.table2 t1
join db2.table2 t2 on t1.var1='Test2' and t2.var1=t1.var1 and t1.code<t2.code
) c
on (m.code=c.code1 and m.code2=c.code2) or (m.code=c.code2 and m.code2=c.code1)
group by m.annee
order by m.annee desc
I used JOIN instead of CROSS JOIN, and JOIN instead of IN.
If that works for you, you can try to optimize the query further:
select m.annee,sum(m.var1)
from db2.table2 t1
join db2.table2 t2 on t1.var1='Test2' and t2.var1=t1.var1 and t1.code<t2.code
join db1.table1 m on (m.code=t1.code and m.code2=t2.code) or (m.code=t2.code and m.code2=t1.code)
group by m.annee
order by m.annee desc
The first JOIN returns all combinations of the test2 codes. For the sample data these are (11,14), (11,15) and (14,15):
db2.table2 t1
join db2.table2 t2 on t1.var1='Test2' and t2.var1=t1.var1 and t1.code<t2.code
The second JOIN looks up the rows of table1 that match these combinations:
join db1.table1 m on (m.code=t1.code and m.code2=t2.code) or (m.code=t2.code and m.code2=t1.code)
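The two steps above can be reproduced on the sample data. This is a minimal sketch, not the answerer's own test: it assumes Python's built-in sqlite3 as a stand-in for MySQL, drops the db1/db2 prefixes, and compares against 'test2' in lowercase because SQLite compares text case-sensitively by default.

```python
import sqlite3

# In-memory SQLite database standing in for db1/db2 (assumption: the thread targets MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (annee INT, code INT, code2 INT, var1 INT);
CREATE TABLE table2 (var1 TEXT, code INT);
""")

# Sample rows from the question, repeated for 1991 and 1992.
sample = [(11, 12, 779), (11, 14, 105), (11, 15, 10), (12, 11, 466),
          (12, 14, 296), (12, 15, 270), (14, 11, 15), (14, 12, 510),
          (14, 15, 6), (15, 11, 193), (15, 12, 455), (15, 14, 4)]
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(year, c, c2, v) for year in (1991, 1992) for c, c2, v in sample],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("test", 11), ("test", 12), ("test", 14),
     ("test2", 11), ("test2", 14), ("test2", 15)],
)

# Step 1: the self-join on table2 yields each unordered pair of test2 codes once.
pairs = conn.execute("""
    SELECT t1.code, t2.code
    FROM table2 t1
    JOIN table2 t2 ON t1.var1 = 'test2' AND t2.var1 = t1.var1 AND t1.code < t2.code
    ORDER BY t1.code, t2.code
""").fetchall()
print(pairs)   # [(11, 14), (11, 15), (14, 15)]

# Step 2: join those pairs to table1 in both orders and aggregate per year.
totals = conn.execute("""
    SELECT m.annee, SUM(m.var1)
    FROM table2 t1
    JOIN table2 t2 ON t1.var1 = 'test2' AND t2.var1 = t1.var1 AND t1.code < t2.code
    JOIN table1 m ON (m.code = t1.code AND m.code2 = t2.code)
                  OR (m.code = t2.code AND m.code2 = t1.code)
    GROUP BY m.annee
    ORDER BY m.annee DESC
""").fetchall()
print(totals)  # [(1992, 333), (1991, 333)]
```

Each table1 row with two different test2 codes matches exactly one pair, so no row is counted twice in the sum.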
Also try the following variant:
select m.annee,sum(m.var1)
from db2.table2 t1
join db2.table2 t2 on t1.var1='Test2' and t2.var1=t1.var1 and t1.code<>t2.code
join db1.table1 m on m.code=t1.code and m.code2=t2.code
group by m.annee
order by m.annee desc
If the last variant returns the correct result, you can try adding an index on (code, code2) to table1:
CREATE INDEX idx_table1_code_code2 ON db1.table1 (code,code2)
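Whether the last variant "returns the correct result" can be checked by running it next to the original query. The sketch below does that under the same assumptions as before (Python's sqlite3 instead of MySQL, no db1/db2 prefixes, lowercase 'test2'). The extra row with code = code2 is hypothetical and is there only to show the one case where the two queries disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (annee INT, code INT, code2 INT, var1 INT);
CREATE TABLE table2 (var1 TEXT, code INT);
-- Index suggested in the answer.
CREATE INDEX idx_table1_code_code2 ON table1 (code, code2);
""")
sample = [(11, 12, 779), (11, 14, 105), (11, 15, 10), (12, 11, 466),
          (12, 14, 296), (12, 15, 270), (14, 11, 15), (14, 12, 510),
          (14, 15, 6), (15, 11, 193), (15, 12, 455), (15, 14, 4)]
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(year, c, c2, v) for year in (1991, 1992) for c, c2, v in sample],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("test", 11), ("test", 12), ("test", 14),
     ("test2", 11), ("test2", 14), ("test2", 15)],
)

# The query from the question (two IN subqueries over a cross join).
ORIGINAL = """
    SELECT annee, SUM(var1) FROM table1 AS m
    WHERE m.code IN (SELECT t1.code FROM table2 AS t1 CROSS JOIN table2 AS t2
                     WHERE t1.var1 = 'test2' AND t2.var1 = 'test2' AND t1.code <> t2.code)
      AND m.code2 IN (SELECT t2.code FROM table2 AS t1 CROSS JOIN table2 AS t2
                      WHERE t1.var1 = 'test2' AND t2.var1 = 'test2' AND t1.code <> t2.code)
    GROUP BY annee ORDER BY annee DESC
"""

# The last variant from the answer (plain joins with <>).
VARIANT = """
    SELECT m.annee, SUM(m.var1)
    FROM table2 t1
    JOIN table2 t2 ON t1.var1 = 'test2' AND t2.var1 = t1.var1 AND t1.code <> t2.code
    JOIN table1 m ON m.code = t1.code AND m.code2 = t2.code
    GROUP BY m.annee ORDER BY m.annee DESC
"""

# On the sample data both queries agree.
before_original = conn.execute(ORIGINAL).fetchall()
before_variant = conn.execute(VARIANT).fetchall()
print(before_original == before_variant)  # True

# Hypothetical row with code = code2: the IN version keeps it, the join drops it.
conn.execute("INSERT INTO table1 VALUES (1991, 11, 11, 1000)")
after_original = conn.execute(ORIGINAL).fetchall()
after_variant = conn.execute(VARIANT).fetchall()
print(after_original)  # [(1992, 333), (1991, 1333)]
print(after_variant)   # [(1992, 333), (1991, 333)]
```

So the variant is a safe replacement only if table1 never holds rows with code = code2, or if such rows are meant to be excluded.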
Answer 1: (score: 1)
I tried to make your query logic simpler. Hope this helps.
select annee,sum(var1)
from db1.table1 as M where
exists( select var1 from db2.table2 t2
where t2.var1='Test2'
group by t2.var1
having sum(t2.code = M.code) >= 1
and sum(t2.code = M.code2) >= 1
and (M.code != M.code2 or sum(t2.code != M.code) >= 1))
group by annee
order by annee desc
Answer 2: (score: 0)
table2: INDEX(var1, code)
table1: INDEX(code, code2, annee)
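Change IN ( SELECT ... ) to JOIN ( SELECT ... ) ON ...; the former is poorly optimized.
If you are using MySQL 5.6 or later, the subquery may be adequately optimized. If you are using an older version, create a TEMPORARY TABLE from that repeated subquery.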
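The temporary-table suggestion could look roughly like the sketch below. It is an illustration only: it runs on Python's sqlite3 rather than MySQL, the name tmp_codes and the index names are made up for the example, and 'test2' is written in lowercase because SQLite compares text case-sensitively by default. One caveat to verify on your server: MySQL has documented restrictions on referring to the same TEMPORARY table more than once in a single query, so a MySQL version may need two temporary tables instead of the two aliases used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (annee INT, code INT, code2 INT, var1 INT);
CREATE TABLE table2 (var1 TEXT, code INT);
-- Indexes suggested in the answer (index names are hypothetical).
CREATE INDEX idx_table2_var1_code ON table2 (var1, code);
CREATE INDEX idx_table1_code_code2_annee ON table1 (code, code2, annee);
""")
sample = [(11, 12, 779), (11, 14, 105), (11, 15, 10), (12, 11, 466),
          (12, 14, 296), (12, 15, 270), (14, 11, 15), (14, 12, 510),
          (14, 15, 6), (15, 11, 193), (15, 12, 455), (15, 14, 4)]
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(year, c, c2, v) for year in (1991, 1992) for c, c2, v in sample],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("test", 11), ("test", 12), ("test", 14),
     ("test2", 11), ("test2", 14), ("test2", 15)],
)

# Materialize the repeated subquery once; DISTINCT removes the duplicates
# that the cross join produces for each code.
conn.execute("""
    CREATE TEMPORARY TABLE tmp_codes AS
    SELECT DISTINCT t1.code AS code
    FROM table2 AS t1 CROSS JOIN table2 AS t2
    WHERE t1.var1 = 'test2' AND t2.var1 = 'test2' AND t1.code <> t2.code
""")

# Replace both IN ( SELECT ... ) filters with joins against the temporary table.
totals = conn.execute("""
    SELECT m.annee, SUM(m.var1)
    FROM table1 AS m
    JOIN tmp_codes AS a ON a.code = m.code
    JOIN tmp_codes AS b ON b.code = m.code2
    GROUP BY m.annee
    ORDER BY m.annee DESC
""").fetchall()
print(totals)  # [(1992, 333), (1991, 333)]
```

Because tmp_codes holds each code once, the joins act as filters and do not multiply table1 rows, so the sums match the original IN-based query on the sample data.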