I have the following query:
SELECT DISTINCT
CONCAT(COALESCE(location.google_id, ''),
'-',
COALESCE(locationData.resolution, ''),
'-',
COALESCE(locationData.time_slice, '')) AS google_id
FROM
LocationData AS locationData
JOIN
Location AS location ON location.id = locationData.location_id
WHERE
location.company_google_id = 5679037876797440
AND location.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
AND locationData.resolution = 8
AND locationData.time_slice >= ((SELECT max(s.time_slice) FROM LocationData as s WHERE s.location_id = location.id ORDER BY s.time_slice ASC) - 255)
AND location.active = TRUE
ORDER BY location.google_id ASC , locationData.time_slice ASC
LIMIT 0 , 101
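I have indexes on all the columns in the WHERE and ORDER BY clauses, and I added a composite index on (LocationData.time_slice, LocationData.location_id).
Running EXPLAIN gives the following (this was a bit of a challenge to format, so hopefully it displays well):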
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | location | range | PRIMARY,google_id_UNIQUE | google_id_UNIQUE | 8 | NULL | 3 | Using index condition; Using where; Using temporary; Using filesort
1 | PRIMARY | locationData | ref | max_time_slice_idx,max_time_slice_idx_desc | max_time_slice_idx | 5 | index2.location.id | 301 | Using where
2 | DEPENDENT SUBQUERY | s | ref | max_time_slice_idx,max_time_slice_idx_desc | max_time_slice_idx | 5 | index2.location.id | 301 | Using index
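I know dependent subqueries are slow, and I'm open to suggestions that give similar behavior. However, I'm seeing this query take about 92 seconds to run in production, which is roughly 4 orders of magnitude slower than what I saw on my test data before adding the new composite index to production.
Does the index build happen as soon as the ALTER statement is run? Is there any way to check that the index is working correctly?
Row counts for the two tables:
Production:
Location: 6,814
LocationData: 13,070,888
Test data:
Location: 626
LocationData: 594,780
Any thoughts or suggestions are appreciated. Thanks in advance!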
Answer 0 (score: 1)
Just a suggestion:
You could avoid the dependent subquery by using an inner join
SELECT DISTINCT
CONCAT(COALESCE(location.google_id, ''),
'-',
COALESCE(locationData.resolution, ''),
'-',
COALESCE(locationData.time_slice, '')) AS google_id
FROM LocationData AS locationData
INNER JOIN Location AS location ON location.id = locationData.location_id
INNER JOIN (
SELECT s.location_id, max(s.time_slice) -255 my_max_time_slice
FROM LocationData as s
GROUP BY s.location_id
) t on t.location_id = location.id
WHERE
location.company_google_id = 5679037876797440
AND location.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
AND locationData.resolution = 8
AND locationData.time_slice >= t.my_max_time_slice
AND location.active = TRUE
ORDER BY location.google_id ASC , locationData.time_slice ASC
LIMIT 0 , 101
This way, the aggregated max_time_slice result is built by a single query, and you avoid repeating the subquery for each id.
Hope this is useful.
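One possible refinement, offered as a sketch rather than as part of the query above: as written, the derived table t aggregates over every location_id in LocationData (about 13 million rows in production), although only three locations are needed. Assuming the optimizer does not push the outer filters into the derived table on its own, its body could be narrowed as below. The alias l is introduced here for illustration; the table and column names are taken from the question.

```sql
-- Candidate body for the derived table t: aggregate only over the
-- locations that pass the same filters as the outer query.
SELECT s.location_id,
       MAX(s.time_slice) - 255 AS my_max_time_slice
FROM LocationData AS s
INNER JOIN Location AS l ON l.id = s.location_id
WHERE l.company_google_id = 5679037876797440
  AND l.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
  AND l.active = TRUE
GROUP BY s.location_id;
```

Whether this is actually faster depends on the MySQL version and the available indexes, so it is worth comparing the EXPLAIN output of both variants.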
Answer 1 (score: 0)
(Adding to @scaisEdge's recommendation...)
WHERE l.company_google_id = 5679037876797440
AND l.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
AND ld.resolution = 8
AND ld.time_slice >= t.my_max_time_slice
AND l.active = TRUE
ORDER BY l.google_id ASC , ld.time_slice ASC
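The optimal indexes, assuming the subquery needs to run first (which is the case for older versions of MySQL):
LocationData: (location_id, time_slice) -- in this order, for the subquery
locationData: (time_slice, resolution, location_id) -- for JOIN
If id is the PRIMARY KEY of location, then no extra index is needed there.
For newer versions, the subquery can be materialized and a suitable index can be built for it. In that case, the optimizer would probably start with location:
location: (company_google_id, active, -- in either order
google_id) -- last
locationData: (location_id, time_slice) -- in this order (for subquery)
locationData: (location_id, resolution, -- in either order (for JOIN)
time_slice) -- last
Since the ORDER BY hits two tables, it cannot be optimized, and the sort cannot be avoided.
I suggest you add all of these indexes and, if further discussion is needed, provide the output of EXPLAIN SELECT ...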
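As a rough sketch, the index suggestions above could be written as the following statements. The index names (idx_...) are hypothetical, and any index that already exists with the same columns in the same order (the EXPLAIN output in the question mentions max_time_slice_idx, whose definition is not shown) should be skipped rather than duplicated. In MySQL the index is built as part of the ALTER TABLE statement itself, so it is available once the statement returns.

```sql
-- Hypothetical index names; the column order follows the lists above.
ALTER TABLE Location
    ADD INDEX idx_company_active_google (company_google_id, active, google_id);

ALTER TABLE LocationData
    ADD INDEX idx_loc_time (location_id, time_slice),
    ADD INDEX idx_loc_res_time (location_id, resolution, time_slice),
    ADD INDEX idx_time_res_loc (time_slice, resolution, location_id);

-- Check which indexes now exist and which columns they cover.
SHOW INDEX FROM Location;
SHOW INDEX FROM LocationData;
SHOW CREATE TABLE LocationData;

-- Refresh index statistics so the optimizer can cost the new indexes.
ANALYZE TABLE Location, LocationData;
```

After that, re-running EXPLAIN on the query and looking at the key column shows whether the optimizer actually picks the new indexes.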