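Hello, I need help optimizing a query against a large table with more than 1 million records. The query currently takes 27-30 seconds to execute.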
SELECT SQL_CALC_FOUND_ROWS
candidate.candidate_id AS candidateID,
candidate.candidate_id AS exportID,
candidate.is_hot AS isHot,
candidate.date_modified AS dateModifiedSort,
candidate.date_created AS dateCreatedSort,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state,
candidate.key_skills AS keySkills,
owner_user.first_name AS ownerFirstName,
owner_user.last_name AS ownerLastName,
CONCAT(owner_user.last_name,
owner_user.first_name) AS ownerSort,
DATE_FORMAT(candidate.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(candidate.date_modified, '%m-%d-%y') AS dateModified,
candidate.email2 AS email2
FROM
candidate
LEFT JOIN
user AS owner_user ON candidate.owner = owner_user.user_id
LEFT JOIN
saved_list_entry ON saved_list_entry.data_item_type = 100
AND saved_list_entry.data_item_id = candidate.candidate_id
WHERE
is_active = 1
GROUP BY candidate.candidate_id
ORDER BY dateModifiedSort DESC
LIMIT 0 , 15
Is there any way to reduce the execution time of this query? I have also added indexes to the table, but they do not seem to help.
Answer 0: (score: 1)
You are using the query pattern
SELECT a vast bunch of stuff
FROM a complex assembly of JOIN operations
ORDER BY some variable DESC
LIMIT 0,small number
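This is inherently inefficient: to satisfy your query, the MySQL server has to construct a vast result set, then sort the whole thing, then take the first 15 rows and discard the rest.
To make it efficient you need to sort less. Here is one way. It looks like you want the fifteen most recently modified candidates. This query retrieves the IDs of those candidates very cheaply. It exploits one of your indexes (the one on date_modified).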
SELECT candidate_id
FROM candidate
ORDER BY date_modified DESC
LIMIT 0, 15
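You can then use this as a subquery in your main query. MySQL does not accept LIMIT inside an IN (...) subquery, so join to it as a derived table instead. Add a clause like this:
JOIN (
SELECT candidate_id
FROM candidate
WHERE is_active = 1
ORDER BY date_modified DESC
LIMIT 0, 15
) AS recent ON recent.candidate_id = candidate.candidate_id
at the appropriate place in your query, right after FROM candidate.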
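Assembled into a complete statement, the idea might look like the sketch below. It joins to the ID query as a derived table, since MySQL does not accept LIMIT inside an IN (...) subquery. The alias `recent` and the shortened column list are illustrative only, and the sketch assumes is_active is a column of candidate.

```sql
-- Deferred join: pick the 15 newest active candidates first, then fetch details.
SELECT
    c.candidate_id  AS candidateID,
    c.first_name    AS firstName,
    c.last_name     AS lastName,
    c.date_modified AS dateModifiedSort,
    u.first_name    AS ownerFirstName,
    u.last_name     AS ownerLastName
FROM (
    SELECT candidate_id
    FROM candidate
    WHERE is_active = 1            -- assumption: is_active belongs to candidate
    ORDER BY date_modified DESC
    LIMIT 0, 15
) AS recent                        -- hypothetical alias
JOIN candidate AS c ON c.candidate_id = recent.candidate_id
LEFT JOIN user AS u ON u.user_id = c.owner
ORDER BY c.date_modified DESC;     -- derived-table order is not guaranteed, so sort again
```

Only fifteen rows ever reach the joins and the final sort. Note that in this form FOUND_ROWS() can report at most 15, so a total for pagination has to come from a separate count.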
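Also note that you are using a nonstandard and potentially harmful MySQL-specific extension to GROUP BY. Your query works, but if a candidate matches more than one joined row (more than one owner, say), only one of them is returned, chosen arbitrarily.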
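To make the GROUP BY point concrete, here is a small sketch. The column saved_list_id is a hypothetical name used only for illustration; the original query does not select anything from saved_list_entry.

```sql
-- Assumption: saved_list_entry has a saved_list_id column (hypothetical here).

-- Nondeterministic: saved_list_id is neither grouped nor aggregated, so MySQL
-- returns the value from an arbitrary row of each group. With the
-- ONLY_FULL_GROUP_BY SQL mode enabled, a statement like this is typically
-- rejected instead of silently picking a row.
SELECT c.candidate_id, s.saved_list_id
FROM candidate AS c
LEFT JOIN saved_list_entry AS s
       ON s.data_item_type = 100
      AND s.data_item_id = c.candidate_id
GROUP BY c.candidate_id;

-- Deterministic: every selected column is either grouped or aggregated.
SELECT c.candidate_id,
       MIN(s.saved_list_id)   AS first_list_id,
       COUNT(s.saved_list_id) AS list_count
FROM candidate AS c
LEFT JOIN saved_list_entry AS s
       ON s.data_item_type = 100
      AND s.data_item_id = c.candidate_id
GROUP BY c.candidate_id;
```

The second form states explicitly which value is wanted when a candidate appears on several lists, so the result no longer depends on how the server happens to read the rows.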
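Finally, you appear to have put single-column indexes on many columns of your large table. This is a notorious SQL antipattern: all those indexes slow down INSERT and UPDATE operations, and most of them probably do not speed up any query. Certainly for this query, the only useful indexes are the one on date_modified and the primary key.
Many complex queries are best served by specific multi-column (compound) indexes. A bunch of single-column indexes does not help such queries.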
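As a rough illustration of a compound index aimed at this particular query, the statements below list what the table already has, add one two-column index, and check the plan. The index name is made up, and the sketch assumes is_active and date_modified are both columns of candidate.

```sql
-- Review the indexes that already exist before adding or dropping any.
SHOW INDEX FROM candidate;

-- Hypothetical index name: equality column first, sort column second.
CREATE INDEX idx_candidate_active_modified
    ON candidate (is_active, date_modified);

-- Check the plan: the "key" column should name the new index, and ideally
-- "Extra" should no longer mention "Using filesort".
EXPLAIN
SELECT candidate_id
FROM candidate
WHERE is_active = 1
ORDER BY date_modified DESC
LIMIT 0, 15;
```

Putting the equality-tested column first and the ORDER BY column second is what allows the server to read the newest rows straight from the index rather than sorting every active candidate.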
Answer 1: (score: 1)
I have changed the table alias in the query below. Using it should solve your problem.
SELECT SQL_CALC_FOUND_ROWS
candidate.candidate_id AS candidateID,
candidate.candidate_id AS exportID,
candidate.is_hot AS isHot,
candidate.date_modified AS dateModifiedSort,
candidate.date_created AS dateCreatedSort,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state,
candidate.key_skills AS keySkills,
user.first_name AS ownerFirstName,
user.last_name AS ownerLastName,
CONCAT(user.last_name,
user.first_name) AS ownerSort,
DATE_FORMAT(candidate.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(candidate.date_modified, '%m-%d-%y') AS dateModified,
candidate.email2 AS email2
FROM
candidate
LEFT JOIN
user ON candidate.owner = user.user_id
LEFT JOIN
saved_list_entry ON saved_list_entry.data_item_type = 100
AND saved_list_entry.data_item_id = candidate.candidate_id
WHERE
is_active = 1
GROUP BY candidate.candidate_id
ORDER BY dateModifiedSort DESC
LIMIT 0 , 15
Use the following statements to create indexes for the join conditions:
create index index_user on user(user_id);
create index index_saved_list_entry on saved_list_entry(data_item_type, data_item_id);
create index index_candidate on candidate(is_active, candidate_id, date_modified);
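Answer 2: (score: 1)
First, I suspect candidate_id only ever has one entry per candidate, so why you are doing a GROUP BY is beyond me; it can be removed, which should improve things a little.
Second, you are left-joining the saved_list_entry table but not actually pulling any columns from it, so it can probably be removed entirely.
Third, given that GROUP BY no longer applies, I would suggest updating the indexes to: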
table               index
candidate           ( is_active, date_modified, candidate_id, owner )
user                ( user_id )
saved_list_entry    ( data_item_id, data_item_type )
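Because your ORDER BY is on the modified date descending, putting date_modified in the second position, right after is_active (the WHERE condition), lets MySQL walk the index and reach your first 15 rows quickly. SQL_CALC_FOUND_ROWS will still have to run through every other qualifying row, but the result set will already be sorted by the index to match.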
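One way to avoid paying for SQL_CALC_FOUND_ROWS on every page is to fetch the total with a separate count, which is also what the MySQL manual suggests now that the option is deprecated (as of MySQL 8.0.17). A minimal sketch, assuming is_active is a column of candidate and that each active candidate yields exactly one result row:

```sql
-- Total number of rows the paginated query would return without LIMIT.
-- With an index that starts with is_active, this can be answered from
-- the index alone.
SELECT COUNT(*) AS total_candidates
FROM candidate
WHERE is_active = 1;
```

The paginated query can then drop SQL_CALC_FOUND_ROWS and stop as soon as it has its 15 rows.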
SELECT SQL_CALC_FOUND_ROWS
c.candidate_id AS candidateID,
c.candidate_id AS exportID,
c.is_hot AS isHot,
c.date_modified AS dateModifiedSort,
c.date_created AS dateCreatedSort,
c.first_name AS firstName,
c.last_name AS lastName,
c.city AS city,
c.state AS state,
c.key_skills AS keySkills,
u.first_name AS ownerFirstName,
u.last_name AS ownerLastName,
CONCAT(u.last_name, u.first_name) AS ownerSort,
DATE_FORMAT(c.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(c.date_modified, '%m-%d-%y') AS dateModified,
c.email2 AS email2
FROM
candidate c
LEFT JOIN user u
ON c.owner = u.user_id
LEFT JOIN saved_list_entry s
ON c.candidate_id = s.data_item_id
AND s.data_item_type = 100
WHERE
c.is_active = 1
GROUP BY
c.candidate_id
ORDER BY
c.date_modified DESC
LIMIT
0, 15
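The query above still carries the GROUP BY and the saved_list_entry join that the first two points suggest removing. A trimmed version might look like the sketch below. It assumes candidate_id is unique in candidate and user_id is unique in user, so that neither join can multiply rows.

```sql
SELECT SQL_CALC_FOUND_ROWS
    c.candidate_id  AS candidateID,
    c.candidate_id  AS exportID,
    c.is_hot        AS isHot,
    c.date_modified AS dateModifiedSort,
    c.date_created  AS dateCreatedSort,
    c.first_name    AS firstName,
    c.last_name     AS lastName,
    c.city          AS city,
    c.state         AS state,
    c.key_skills    AS keySkills,
    u.first_name    AS ownerFirstName,
    u.last_name     AS ownerLastName,
    CONCAT(u.last_name, u.first_name) AS ownerSort,
    DATE_FORMAT(c.date_created, '%m-%d-%y')  AS dateCreated,
    DATE_FORMAT(c.date_modified, '%m-%d-%y') AS dateModified,
    c.email2        AS email2
FROM candidate c
LEFT JOIN user u
       ON c.owner = u.user_id      -- assumption: user_id is unique
WHERE c.is_active = 1
ORDER BY c.date_modified DESC      -- matches the (is_active, date_modified, ...) index
LIMIT 0, 15;
```

Without the GROUP BY, the server has no grouping step between the index scan and the LIMIT, which is what lets the suggested index order pay off.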
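Answer 3: (score: 1)
Get rid of saved_list_entry; it contributes nothing.
Delay the join to user. This lets you get rid of the GROUP BY, which adds some time and may inflate the value of FOUND_ROWS().
Something like: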
SELECT c2.*,
    ou.first_name AS ownerFirstName,
    ou.last_name AS ownerLastName,
    CONCAT(ou.last_name, ou.first_name) AS ownerSort
FROM
    ( SELECT SQL_CALC_FOUND_ROWS  -- if the server rejects this option inside a derived table, drop it and run a separate COUNT(*)
          c.candidate_id AS candidateID, c.candidate_id AS exportID,
          c.owner,  -- needed by the outer join to user
          c.is_hot AS isHot, c.date_modified AS dateModifiedSort,
          c.date_created AS dateCreatedSort, c.first_name AS firstName,
          c.last_name AS lastName, c.city AS city, c.state AS state,
          c.key_skills AS keySkills,
          DATE_FORMAT(c.date_created, '%m-%d-%y') AS dateCreated,
          DATE_FORMAT(c.date_modified, '%m-%d-%y') AS dateModified,
          c.email2 AS email2
      FROM candidate AS c
      WHERE c.is_active = 1
      GROUP BY c.candidate_id  -- can be dropped if candidate_id is unique
      ORDER BY c.date_modified DESC  -- note change here
      LIMIT 0 , 15
    ) AS c2
LEFT JOIN user AS ou ON c2.owner = ou.user_id
ORDER BY c2.dateModifiedSort DESC;  -- the derived table's order is not guaranteed to survive the join
(I messed up the column order, but you can fix that.)
Index needed:
candidate: INDEX(is_active, candidate_id, date_modified)