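I have been refactoring a pile of application logic and SQL that has been plaguing our system.
I managed to get rid of most of the application-layer logic and do everything in a single SQL query, but it seems somewhat slow and I'm not sure why.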
SELECT
    st.id ownerId,
    st.display_name ownerLabel,
    COALESCE((score.mean_score / score.num_responses) * 100, 0) meanScore,
    COALESCE((score.top_box_percentage / score.num_responses) * 100, 0) topBoxPercentage,
    COALESCE(score.num_responses, 0) sampleSize
FROM question q
CROSS JOIN store st
LEFT JOIN
    (SELECT
        COUNT(ch.id) num_responses,
        SUM(ans.mean_score_weight) mean_score,
        SUM(ans.is_top_box) top_box_percentage,
        q.id question_id,
        q.category_id category_id,
        st.id store_id
    FROM choice ch
    INNER JOIN response r ON r.id = ch.response_id
    INNER JOIN answer ans ON ans.id = ch.answer_id
    INNER JOIN store st ON st.id = r.store_id
    INNER JOIN question q ON q.id = ans.question_id
    WHERE r.survey_id = 96
        AND r.created_at BETWEEN '2015-01-01' AND '2015-03-01'
        AND q.is_scorable
        AND ans.is_scorable
    GROUP BY q.id, st.id
    ) score ON score.question_id = q.id AND score.store_id = st.id
WHERE q.survey_id = 96 AND q.is_scorable
GROUP BY q.id, st.id;
The EXPLAIN output for this query is as follows:
+----+-------------+------------+--------+------------------------------------------------+------------------------+---------+-----------------------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+--------+------------------------------------------------+------------------------+---------+-----------------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | q | ref | question_FI_6 | question_FI_6 | 4 | const | 77 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | st | ALL | NULL | NULL | NULL | NULL | 339 | 100.00 | Using join buffer |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3505 | 100.00 | |
| 2 | DERIVED | r | ref | PRIMARY,response_FI_3,response_FI_4 | response_FI_3 | 5 | | 5179 | 100.00 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | st | eq_ref | PRIMARY | PRIMARY | 4 | titan.r.store_id | 1 | 100.00 | Using index |
| 2 | DERIVED | ch | ref | unique_response_answer,choice_FI_1,choice_FI_3 | unique_response_answer | 4 | titan.r.id | 35 | 100.00 | Using index |
| 2 | DERIVED | ans | eq_ref | PRIMARY,answer_FI_1 | PRIMARY | 4 | titan.ch.answer_id | 1 | 100.00 | Using where |
| 2 | DERIVED | q | eq_ref | PRIMARY | PRIMARY | 4 | titan.ans.question_id | 1 | 100.00 | Using where |
+----+-------------+------------+--------+------------------------------------------------+------------------------+---------+-----------------------+------+----------+----------------------------------------------+
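It looks to me like the query is slow because of the filesort + temporary table on response. My experience with MySQL is fairly limited, so I'm not sure how to fix this. Any help would be much appreciated.
Indexes on response: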
+----------+------------+--------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+--------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| response | 0 | PRIMARY | 1 | id | A | 53911 | NULL | NULL | | BTREE | | |
| response | 1 | response_FI_3 | 1 | survey_id | A | 104 | NULL | NULL | YES | BTREE | | |
| response | 1 | response_FI_4 | 1 | store_id | A | 523 | NULL | NULL | YES | BTREE | | |
| response | 1 | fk_response_competition_id_idx | 1 | competition_id | A | 13 | NULL | NULL | YES | BTREE | | |
+----------+------------+--------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
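While writing this up I found that I can eliminate the filesort by taking advantage of the key already in use (response_FI_3, on r.survey_id), which gives better results, but I still think more can be done to improve this query.
Thanks for any input.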
Answer 0 (score: 1)
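If you see Using filesort, MySQL could not use an index to produce the rows in the order it needs and is sorting them in an extra pass. From what I can see, the created_at column is a likely culprit: you filter on it in the WHERE clause, but there is no index on it. You have a similar problem on the question table, but without the list of indexes on that table I can't tell you where.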
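To follow up on this suggestion, something like the statements below could be run. This is only a sketch: the index name idx_response_created_at is made up for illustration, and whether the optimizer actually picks a single-column index on created_at over response_FI_3 depends on the data.

```sql
-- List the indexes on question, which the question did not include.
SHOW INDEX FROM question;

-- Hypothetical index name; adds the index on created_at that this answer suggests.
CREATE INDEX idx_response_created_at ON response (created_at);
```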
Answer 1 (score: 0)
The LEFT may be causing the problem. Can you get rid of it? Note that the optimizer (see the EXPLAIN) was unable to start with the subquery.
The condition survey_id = 96 AND r.created_at ... says that r needs a 'composite index', INDEX(survey_id, created_at). Please provide SHOW CREATE TABLE.
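A minimal sketch of adding that composite index and re-checking the plan. The index name idx_response_survey_created is an assumption made for illustration; the columns and the filter come from the question. The equality column (survey_id) goes first so that the range on created_at can use the second part of the index.

```sql
-- Hypothetical index name; equality column first, range column second.
ALTER TABLE response
  ADD INDEX idx_response_survey_created (survey_id, created_at);

-- Re-check the plan for the same filter the subquery applies to response.
EXPLAIN
SELECT r.id
FROM response r
WHERE r.survey_id = 96
  AND r.created_at BETWEEN '2015-01-01' AND '2015-03-01';
```

If the index helps, the key column of the EXPLAIN row for r should name the new index instead of response_FI_3.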
You don't really have a CROSS JOIN, so let me suggest a rewrite:
SELECT ...
FROM ( SELECT ... ) AS score
JOIN store AS st ON score.store_id = st.id
JOIN question q ON score.question_id = q.id
WHERE q.survey_id = 96 AND q.is_scorable
GROUP BY q.id, st.id;
BETWEEN '2015-01-01' AND '2015-03-01' - If the column is a DATE, the range (mistakenly?) includes March 1. If it is a DATETIME, it includes one extra midnight, 2015-03-01 00:00:00.
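A common way to avoid that boundary problem is a half-open range. The sketch below assumes the intent was January and February 2015 only, which the question does not state; it behaves the same whether created_at is a DATE or a DATETIME.

```sql
-- Half-open range: includes all of Jan and Feb 2015 and excludes March 1 entirely.
SELECT COUNT(*)
FROM response r
WHERE r.survey_id = 96
  AND r.created_at >= '2015-01-01'
  AND r.created_at <  '2015-03-01';
```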
Have you tried running the subquery by itself? Is it OK, or should we look into it? Please provide SHOW CREATE TABLE for each table.
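For reference, those two checks could look like this. The subquery is copied from the question unchanged; the table list is taken from the tables it joins.

```sql
-- The derived table from the question, run on its own to time it in isolation.
-- Prefix it with EXPLAIN to see its plan without the outer joins.
SELECT
    COUNT(ch.id) num_responses,
    SUM(ans.mean_score_weight) mean_score,
    SUM(ans.is_top_box) top_box_percentage,
    q.id question_id,
    q.category_id category_id,
    st.id store_id
FROM choice ch
INNER JOIN response r ON r.id = ch.response_id
INNER JOIN answer ans ON ans.id = ch.answer_id
INNER JOIN store st ON st.id = r.store_id
INNER JOIN question q ON q.id = ans.question_id
WHERE r.survey_id = 96
    AND r.created_at BETWEEN '2015-01-01' AND '2015-03-01'
    AND q.is_scorable
    AND ans.is_scorable
GROUP BY q.id, st.id;

-- Table definitions, including all indexes and column types.
SHOW CREATE TABLE choice;
SHOW CREATE TABLE response;
SHOW CREATE TABLE answer;
SHOW CREATE TABLE store;
SHOW CREATE TABLE question;
```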