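I have an IMDB database and I am trying to compute, for each year, the average number of actors per production. My question is about the speed difference that appears when selecting from a subquery.
My query is: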
SELECT AVG(sub.num)
FROM
(SELECT
COUNT(production_cast.person_id) AS num,
production.production_year AS pyear
FROM production_cast
INNER JOIN production ON production.id = production_cast.production_id
GROUP BY production.id) sub
GROUP BY(sub.pyear)
To simplify, however, these are the two queries the question is really about:
With the subquery:
SELECT sub.num
FROM
(SELECT
COUNT(production_cast.person_id) AS num,
production.production_year AS pyear
FROM production_cast
INNER JOIN production ON production.id = production_cast.production_id
GROUP BY production.id) sub
Without the subquery:
SELECT
COUNT(production_cast.person_id) AS num,
production.production_year AS pyear
FROM production_cast
INNER JOIN production ON production.id = production_cast.production_id
GROUP BY production.id
The second one takes less than a second; the first one never finishes (it runs for more than 5 minutes).
EXPLAIN with the subquery:
+-------------+-----------------+-------+-----------------------------------+-------------+
| select_type | table           | type  | key                               | Extra       |
+-------------+-----------------+-------+-----------------------------------+-------------+
| PRIMARY     | <derived2>      | ALL   | NULL                              | NULL        |
| DERIVED     | production      | index | idx_Production_id_production_year | Using index |
| DERIVED     | production_cast | ref   | production_id                     | NULL        |
+-------------+-----------------+-------+-----------------------------------+-------------+
EXPLAIN without the subquery:
+-------------+-----------------+-----------------------------------+-------------+
| select_type | table           | key                               | Extra       |
+-------------+-----------------+-----------------------------------+-------------+
| SIMPLE      | production      | idx_Production_id_production_year | Using index |
| SIMPLE      | production_cast | production_id                     | NULL        |
+-------------+-----------------+-----------------------------------+-------------+
What is the reason behind this performance difference, and what can be done to prevent it?
Answer 0 (score: 0)
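DERIVED: the result is dumped into a temporary table, with no possibility of using an index.
Subqueries: avoid them like the plague.
From https://www.percona.com/blog/2006/08/31/derived-tables-and-views-performance/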
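"One more thing to watch for is the fact that derived tables are going to be materialized even to execute an EXPLAIN statement. So if you have made a mistake in the select in the FROM clause, e.g. forgotten a join condition, you might have EXPLAIN running forever."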
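One way to sidestep the derived table for the original goal is to compute the average directly. This is only a sketch and I have not run it against the asker's data. It assumes each production.id has exactly one production_year, so the per-year average of the per-production counts equals the total number of cast rows divided by the number of distinct productions in that year. The alias avg_cast is my own name, not something from the question.

```sql
-- Sketch: per-year average cast size without a derived table.
-- Assumes production.id determines production_year (one year per production).
SELECT
  production.production_year AS pyear,
  -- total cast rows in the year divided by the number of productions in it
  COUNT(production_cast.person_id) / COUNT(DISTINCT production.id) AS avg_cast
FROM production_cast
INNER JOIN production ON production.id = production_cast.production_id
GROUP BY production.production_year;
```

Because of the INNER JOIN, productions with no cast rows are left out of the average, exactly as in the original query. Whether this is actually faster depends on the server version and the available indexes, so compare the EXPLAIN output before relying on it.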