I have a very slow query, which I've posted here: http://pastebin.com/E5sdRi7e. When I run EXPLAIN on it, I get the following:
id  select_type         table       type    possible_keys  key           key_len  ref                                 rows  Extra
1   PRIMARY             <derived2>  ALL     NULL           NULL          NULL     NULL                                5     Using filesort
2   DERIVED             Workflow    ALL     PRIMARY        NULL          NULL     NULL                                9     Using temporary; Using filesort
2   DERIVED             <derived3>  ALL     NULL           NULL          NULL     NULL                                141   Using where; Using join buffer
2   DERIVED             DataSource  ALL     PRIMARY        NULL          NULL     NULL                                1310  Using where; Using join buffer
2   DERIVED             <derived4>  ALL     NULL           NULL          NULL     NULL                                1310  Using where; Using join buffer
2   DERIVED             User        eq_ref  PRIMARY        PRIMARY       4        LatestDataSourceActivityLog.UserId  1
4   DERIVED             t1          ALL     NULL           NULL          NULL     NULL                                5400  Using where; Using temporary; Using filesort
5   DEPENDENT SUBQUERY  t2          ref     DataSourceId   DataSourceId  4        companyname_db.t1.DataSourceId      4
3   DERIVED             DataSource  range   PRIMARY        PRIMARY       4        NULL                                142   Using where
What does the table above tell me? Does it help me determine which fields should be indexed?
Any help is greatly appreciated.
Query
SELECT WrappedData.*
FROM (SELECT ParentLeafNodeDataSource.Id,
             LatestDataSourceActivityLog.UserId,
             DataSource.Status AS StatusCode,
             ( CASE
                 WHEN User.Name IS NULL THEN 'CompanyName'
                 ELSE User.Name
               END ) AS `Username`,
             Workflow.Name AS WorkflowName,
             LatestDataSourceActivityLog.Timestamp
      FROM DataSource,
           Workflow,
           (SELECT *
            FROM DataSource
            WHERE DataSource.Id IN ( 0, 1, 2, 3,
                                     4, 5, 6, 7,
                                     8, 9, 10, 11,
                                     12, 13, 16, 21,
                                     22, 23, 24, 25,
                                     26, 27, 28, 29,
                                     30, 31, 32, 33,
                                     34, 35, 36, 37,
                                     38, 39, 40, 41,
                                     42, 43, 44, 45,
                                     46, 47, 48, 49,
                                     50, 51, 52, 53,
                                     54, 55, 56, 57,
                                     58, 59, 60, 61,
                                     62, 63, 64, 65,
                                     66, 67, 68, 69,
                                     70, 71, 72, 73,
                                     74, 75, 76, 77,
                                     78, 79, 80, 81,
                                     83, 84, 85, 86,
                                     87, 88, 89, 90,
                                     91, 92, 93, 94,
                                     95, 96, 97, 98,
                                     99, 100, 101, 102,
                                     103, 104, 105, 106,
                                     107, 108, 109, 110,
                                     111, 112, 113, 114,
                                     115, 116, 117, 118,
                                     119, 120, 142, 1293,
                                     1294, 1295, 1296, 1297,
                                     1298, 1299, 143, 1300,
                                     1301, 1302, 1303, 1304,
                                     1305, 1306, 144, 146,
                                     145, 1307, 1308, 1309,
                                     1310, 147, 149, 148,
                                     150, 151 )) AS ParentLeafNodeDataSource,
           (SELECT t1.*
            FROM DataSourceActivityLog AS t1
            WHERE Timestamp = (SELECT Max(t2.Timestamp)
                               FROM DataSourceActivityLog AS t2
                               WHERE t1.DataSourceId = t2.DataSourceId)
            GROUP BY t1.DataSourceId) AS LatestDataSourceActivityLog
           LEFT JOIN User
                  ON User.Id = LatestDataSourceActivityLog.UserId
      WHERE ParentLeafNodeDataSource.Status = '203'
         OR ParentLeafNodeDataSource.Status = '204'
        AND Workflow.Id = ParentLeafNodeDataSource.WorkflowId
        AND LatestDataSourceActivityLog.DataSourceId = ParentLeafNodeDataSource.Id
        AND DataSource.Id = LatestDataSourceActivityLog.DataSourceId
        AND LatestDataSourceActivityLog.UserId = 1
      GROUP BY ParentLeafNodeDataSource.Id) AS WrappedData
ORDER BY WrappedData.`Timestamp` DESC
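Answer 0 (score: 2)
It's hard to say for sure, but here are a few things worth refactoring.
As far as performance goes, the first thing to look at is the aggregate (GROUP BY) functions.
(SELECT t1.*
 FROM DataSourceActivityLog AS t1
 WHERE Timestamp = (SELECT Max(t2.Timestamp)
                    FROM DataSourceActivityLog AS t2
                    WHERE t1.DataSourceId = t2.DataSourceId)
 GROUP BY t1.DataSourceId) AS LatestDataSourceActivityLog
Here the use of MAX can be eliminated entirely: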
(SELECT t1.*
 FROM DataSourceActivityLog AS t1
 WHERE Timestamp = (SELECT t2.Timestamp
                    FROM DataSourceActivityLog AS t2
                    WHERE t1.DataSourceId = t2.DataSourceId
                    ORDER BY t2.Timestamp DESC
                    LIMIT 1)
 GROUP BY t1.DataSourceId) AS LatestDataSourceActivityLog
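To check that the two forms of the subquery pick the same "latest" rows, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration. The outer GROUP BY is left out, since it only matters when several rows share the same maximum Timestamp.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DataSourceActivityLog (
        Id INTEGER PRIMARY KEY,
        DataSourceId INTEGER,
        UserId INTEGER,
        Timestamp TEXT
    );
    -- Hypothetical sample rows: two log entries per data source.
    INSERT INTO DataSourceActivityLog (DataSourceId, UserId, Timestamp) VALUES
        (1, 1, '2011-01-01 10:00:00'),
        (1, 2, '2011-01-02 10:00:00'),
        (2, 1, '2011-01-03 10:00:00'),
        (2, 1, '2011-01-01 09:00:00');
""")

# Form 1: correlated subquery with MAX.
with_max = """
    SELECT t1.DataSourceId, t1.UserId, t1.Timestamp
    FROM DataSourceActivityLog AS t1
    WHERE t1.Timestamp = (SELECT MAX(t2.Timestamp)
                          FROM DataSourceActivityLog AS t2
                          WHERE t2.DataSourceId = t1.DataSourceId)
    ORDER BY t1.DataSourceId
"""

# Form 2: correlated subquery with ORDER BY ... DESC LIMIT 1.
with_limit = """
    SELECT t1.DataSourceId, t1.UserId, t1.Timestamp
    FROM DataSourceActivityLog AS t1
    WHERE t1.Timestamp = (SELECT t2.Timestamp
                          FROM DataSourceActivityLog AS t2
                          WHERE t2.DataSourceId = t1.DataSourceId
                          ORDER BY t2.Timestamp DESC
                          LIMIT 1)
    ORDER BY t1.DataSourceId
"""

rows_max = conn.execute(with_max).fetchall()
rows_limit = conn.execute(with_limit).fetchall()
print(rows_max)
print(rows_max == rows_limit)
```

Both forms return the same rows here. Whether one is actually faster on the real MySQL table is something only EXPLAIN and timing on that server can tell.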
It's probably not a big performance issue, but here you can use IFNULL or COALESCE instead of CASE.
( CASE
    WHEN User.Name IS NULL THEN 'CompanyName'
    ELSE User.Name
  END )
can be written instead as
IFNULL(User.Name, 'CompanyName')
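As a quick sanity check that the three spellings agree, here is a small sketch using Python's sqlite3 module (SQLite also supports IFNULL and COALESCE). The User rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (Id INTEGER PRIMARY KEY, Name TEXT);
    -- Hypothetical rows: one user with a name, one without.
    INSERT INTO User (Id, Name) VALUES (1, 'alice'), (2, NULL);
""")

# The same fallback written three ways, side by side.
rows = conn.execute("""
    SELECT Id,
           CASE WHEN Name IS NULL THEN 'CompanyName' ELSE Name END,
           IFNULL(Name, 'CompanyName'),
           COALESCE(Name, 'CompanyName')
    FROM User
    ORDER BY Id
""").fetchall()

for row in rows:
    print(row)
```

COALESCE is the standard SQL spelling and accepts more than two arguments. IFNULL is the two-argument form available in MySQL and SQLite.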
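As for indexes, they improve SELECT performance by making lookups cheaper, but because every index also has to be updated, they slow down write operations. If your application isn't write-heavy, you should index the columns you commonly filter or join on, especially in large tables.
In this query, it looks like you could benefit from adding an index on DataSourceId, but I can't test whether there is any gain. Primary keys are already indexed.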
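To see how an index changes a lookup on DataSourceId, here is a rough sketch using Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN. Its output differs from MySQL's EXPLAIN, and the index name idx_log_datasource is made up. Note that the EXPLAIN output in the question already lists a key named DataSourceId for t2, so the real table may well have this index already.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DataSourceActivityLog (
        Id INTEGER PRIMARY KEY,
        DataSourceId INTEGER,
        UserId INTEGER,
        Timestamp TEXT
    )
""")

def plan(sql):
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

lookup = "SELECT * FROM DataSourceActivityLog WHERE DataSourceId = 42"

before = plan(lookup)  # no index yet: expect a full table scan
conn.execute(
    "CREATE INDEX idx_log_datasource ON DataSourceActivityLog (DataSourceId)"
)
after = plan(lookup)   # with the index: expect an index search

print("before:", before)
print("after: ", after)
```

In MySQL terms, the "before" case corresponds to type ALL in the EXPLAIN output, and the "after" case to type ref with the index named in the key column.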
Answer 1 (score: 1)
I would try the following:
A quick attempt (I'm not sure whether the result is the same)
SELECT
    dsa.Status AS StatusCode,
    dsb.Id,
    dsl.UserId,
    dsl.Timestamp,
    wf.Name AS WorkflowName,
    COALESCE(u.Name, 'CompanyName') AS `Username`
FROM
    DataSource dsa
    INNER JOIN DataSource dsb
        ON dsb.Id IN ( 0, 1, 2, 3, 4, 5, 6, 7, etc )
        -- parentheses keep the OR from bypassing the other conditions
        AND (dsb.Status = '203' OR dsb.Status = '204')
    INNER JOIN DataSourceActivityLog dsl
        ON dsl.DataSourceId = dsa.Id
        AND dsl.DataSourceId = dsb.Id
        AND dsl.UserId = 1
        AND dsl.Timestamp = (
            SELECT MAX(dslt.Timestamp)
            FROM DataSourceActivityLog AS dslt
            WHERE dslt.DataSourceId = dsl.DataSourceId
        )
    INNER JOIN Workflow wf
        ON wf.Id = dsb.WorkflowId
    LEFT JOIN User u
        ON u.Id = dsl.UserId
GROUP BY
    dsl.Id
ORDER BY
    dsl.Timestamp DESC
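Maybe use Zurahn's refactoring to get rid of the GROUP BY in the subquery.
A variant of the same query, with the filters on dsb moved into the WHERE clause: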
SELECT
    dsa.Status AS StatusCode,
    dsb.Id,
    dsl.UserId,
    dsl.Timestamp,
    wf.Name AS WorkflowName,
    COALESCE(u.Name, 'CompanyName') AS `Username`
FROM
    DataSource dsb
    INNER JOIN Workflow wf
        ON dsb.WorkflowId = wf.Id
    INNER JOIN DataSourceActivityLog dsl
        ON dsl.DataSourceId = dsb.Id
        AND dsl.UserId = 1
        AND dsl.Timestamp = (
            SELECT MAX(dslt.Timestamp)
            FROM DataSourceActivityLog AS dslt
            WHERE dslt.DataSourceId = dsl.DataSourceId
        )
    INNER JOIN DataSource dsa
        ON dsl.DataSourceId = dsa.Id
    LEFT JOIN User u
        ON dsl.UserId = u.Id
WHERE
    dsb.Id IN ( 0, 1, 2, 3, 4, 5, 6, 7, etc )
    -- parentheses keep the OR from bypassing the Id filter
    AND (dsb.Status = '203' OR dsb.Status = '204')
GROUP BY
    dsl.Id
ORDER BY
    dsl.Timestamp DESC
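One thing to watch in the status filter of the question's WHERE clause: AND binds more tightly than OR in SQL, so `Status = '203' OR Status = '204' AND ...` is read as `Status = '203' OR (Status = '204' AND ...)`. A small sketch of the difference, using Python's sqlite3 module with made-up rows (the Id list is shortened for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DataSource (Id INTEGER PRIMARY KEY, Status TEXT);
    -- Hypothetical rows; Ids 99 and 100 are outside the wanted Id list.
    INSERT INTO DataSource (Id, Status) VALUES
        (1, '203'), (2, '204'), (3, '500'), (99, '203'), (100, '204');
""")

# Without parentheses: parsed as Status = '203' OR (Status = '204' AND Id IN ...).
unparenthesized = [row[0] for row in conn.execute("""
    SELECT Id FROM DataSource
    WHERE Status = '203' OR Status = '204' AND Id IN (1, 2, 3)
    ORDER BY Id
""")]

# With parentheses: both statuses are subject to the Id filter.
parenthesized = [row[0] for row in conn.execute("""
    SELECT Id FROM DataSource
    WHERE (Status = '203' OR Status = '204') AND Id IN (1, 2, 3)
    ORDER BY Id
""")]

print(unparenthesized)  # [1, 2, 99]
print(parenthesized)    # [1, 2]
```

Row 99 slips through the unparenthesized filter because its status is '203', which satisfies the OR on its own. `Status IN ('203', '204')` is another way to avoid the problem.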
Answer 2 (score: 0)