I have a SQL query with nested joins:
SELECT rh.host, rh.report, COUNT(results.id), COUNT(results_2.id), COUNT(results_3.id), COUNT(results_4.id)
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results ON rr.result = results.id AND results.type = 'Hole' AND results.host = rh.host
LEFT OUTER JOIN results results_2 ON rr.result = results_2.id AND results_2.type = 'Warning' AND results_2.host = rh.host
LEFT OUTER JOIN results results_3 ON rr.result = results_3.id AND results_3.type = 'Note' AND results_3.host = rh.host
LEFT OUTER JOIN results results_4 ON rr.result = results_4.id AND results_4.type = 'Log' AND results_4.host = rh.host
GROUP BY rh.host
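As written, the query takes about 5 seconds, and 99.7% of that time is spent copying to a temporary table. EXPLAIN for the full query shows: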
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
| 1 | SIMPLE | rr | ALL | report | NULL | NULL | NULL | 3139 | Using temporary; Using filesort |
| 1 | SIMPLE | rh | ref | report | report | 5 | openvas.rr.report | 167 | Using where |
| 1 | SIMPLE | results | eq_ref | PRIMARY,type | PRIMARY | 4 | openvas.rr.result | 1 | |
| 1 | SIMPLE | results_2 | eq_ref | PRIMARY,type | PRIMARY | 4 | openvas.rr.result | 1 | |
| 1 | SIMPLE | results_3 | eq_ref | PRIMARY,type | PRIMARY | 4 | openvas.rr.result | 1 | |
| 1 | SIMPLE | results_4 | eq_ref | PRIMARY,type | PRIMARY | 4 | openvas.rr.result | 1 | |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
When I remove the LEFT JOINs, the query runs in about 1 second, and each LEFT JOIN adds roughly one more second of execution time.
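My questions:
Can anyone explain why the copy-to-temporary-table step of a single query takes longer when there are more LEFT JOINs? Does MySQL copy the temporary table several times, once for each JOIN?
How can I avoid this? Am I missing an index?
What I am trying to accomplish: I have a table with scan results for several hosts. Each result is classified by type ("Hole", "Warning", "Note" or "Log"). I want to select each host together with its counts of holes, warnings, notes and logs. As a "restriction", not every host has results of every type.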
Answer 0 (score: 3)
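You are joining the same table multiple times, which is effectively the same as joining several tables. You should be able to handle this with a few CASE expressions and a WHERE clause. (In fact, you may not need the WHERE clause.)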
SELECT rh.host, rh.report,
COUNT(CASE WHEN results.type = 'Hole' THEN 1 ELSE NULL END) as Holes,
COUNT(CASE WHEN results.type = 'Warning' THEN 1 ELSE NULL END) as Warnings,
COUNT(CASE WHEN results.type = 'Note' THEN 1 ELSE NULL END) as Notes,
COUNT(CASE WHEN results.type = 'Log' THEN 1 ELSE NULL END) as Logs
FROM
report_hosts rh
INNER JOIN
report_results rr
ON
rh.report = rr.report
LEFT OUTER JOIN
results
ON
rr.result = results.id
AND results.host = rh.host
WHERE
results.type = 'Hole'
OR results.type = 'Warning'
OR results.type = 'Note'
OR results.type = 'Log'
GROUP BY rh.host, rh.report
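CASE expressions are, in my experience, not the best performers, but the data bloat from the numerous joins probably costs more, so this version may still give you better performance.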
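To sanity-check the rewrite, here is a minimal sketch that runs both query shapes against toy data. It assumes an in-memory SQLite database driven from Python instead of the MySQL server in the question; the table and column names come from the question, while the rows (including the 'Debug' type for host b) are made up for illustration. It also shows why the WHERE clause may be unwanted: filtering on results.type after a LEFT JOIN discards the rows where no result matched.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are invented test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report_hosts   (host TEXT, report INTEGER);
CREATE TABLE report_results (report INTEGER, result INTEGER);
CREATE TABLE results        (id INTEGER PRIMARY KEY, host TEXT, type TEXT);

INSERT INTO report_hosts VALUES ('a', 1), ('b', 1);
INSERT INTO results VALUES
    (1, 'a', 'Hole'), (2, 'a', 'Hole'), (3, 'a', 'Log'), (4, 'b', 'Debug');
INSERT INTO report_results VALUES (1, 1), (1, 2), (1, 3), (1, 4);
""")

# Shape from the question: one LEFT JOIN on results per type.
FOUR_JOINS = """
SELECT rh.host, rh.report,
       COUNT(r1.id), COUNT(r2.id), COUNT(r3.id), COUNT(r4.id)
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results r1 ON rr.result = r1.id AND r1.type = 'Hole'    AND r1.host = rh.host
LEFT OUTER JOIN results r2 ON rr.result = r2.id AND r2.type = 'Warning' AND r2.host = rh.host
LEFT OUTER JOIN results r3 ON rr.result = r3.id AND r3.type = 'Note'    AND r3.host = rh.host
LEFT OUTER JOIN results r4 ON rr.result = r4.id AND r4.type = 'Log'     AND r4.host = rh.host
GROUP BY rh.host, rh.report
ORDER BY rh.host
"""

# Shape from this answer: a single LEFT JOIN plus conditional counts.
# {where} is filled in below to compare the query with and without the filter.
CONDITIONAL = """
SELECT rh.host, rh.report,
       COUNT(CASE WHEN r.type = 'Hole'    THEN 1 END) AS holes,
       COUNT(CASE WHEN r.type = 'Warning' THEN 1 END) AS warnings,
       COUNT(CASE WHEN r.type = 'Note'    THEN 1 END) AS notes,
       COUNT(CASE WHEN r.type = 'Log'     THEN 1 END) AS logs
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results r ON rr.result = r.id AND r.host = rh.host
{where}
GROUP BY rh.host, rh.report
ORDER BY rh.host
"""

four_joins = conn.execute(FOUR_JOINS).fetchall()
conditional = conn.execute(CONDITIONAL.format(where="")).fetchall()
with_where = conn.execute(
    CONDITIONAL.format(where="WHERE r.type IN ('Hole', 'Warning', 'Note', 'Log')")
).fetchall()

print(four_joins)   # per-type counts from the four-join query
print(conditional)  # same counts from the single-join query
print(with_where)   # host 'b' is missing: the WHERE filter removed its rows
```

On this toy data the single-join query without the WHERE clause returns the same rows as the four-join query, zero counts included, while the filtered variant drops host b entirely. If hosts with no matching results must still appear, leaving the WHERE clause out looks like the safer choice.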
Answer 1 (score: 1)
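Working with a large amount of data (in your case, the extra LEFT JOINs) means that data has to be held in memory.
If the buffer runs out, the intermediate result of the query has to be stored in a temporary table on disk.
Try the same number of LEFT JOINs, but restrict the number of rows with LIMIT. If the query then runs faster, that should confirm that the problem lies with the buffer.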