我有三张桌子。
一个表包含大约75,000行的提交
一个表包含提交评级,并且仅具有< 10排
一个表包含submission =>竞争映射和我的测试数据也有大约75,000行。
我想做的是
在一轮比赛中获得前50名提交。 Top被列为最高平均评分,其次是最高票数
以下是我使用的查询有效,但问题是需要45秒才能完成!我分析了查询(结果在底部),瓶颈是将数据复制到tmp表然后对其进行排序,以便如何加快速度呢?
SELECT `submission_submissions`.*
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
CREATE TABLE `submission_submissions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`description` varchar(255) DEFAULT NULL,
`genre` int(11) NOT NULL,
`goals` text,
`submission` text NOT NULL,
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`date_deleted` datetime DEFAULT NULL,
`cover_image` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `genre` (`genre`),
KEY `account_id` (`account_id`),
KEY `date_created` (`date_created`)
) ENGINE=InnoDB AUTO_INCREMENT=115037 DEFAULT CHARSET=latin1;
CREATE TABLE `submission_ratings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`stars` tinyint(1) NOT NULL,
`date_created` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `submission_id` (`submission_id`),
KEY `account_id` (`account_id`),
KEY `stars` (`stars`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
CREATE TABLE `competition_submissions` (
`competition_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`top_round` int(11) DEFAULT '1',
PRIMARY KEY (`submission_id`),
KEY `competition_id` (`competition_id`),
KEY `top_round` (`top_round`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
state duration (summed) in sec percentage
Copying to tmp table 33.15621 68.46924
Sorting result 11.83148 24.43260
removing tmp table 3.06054 6.32017
Sending data 0.37560 0.77563
... insignificant amounts removed ...
Total 48.42497 100.00000
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE competition_submissions index_merge PRIMARY,competition_id,top_round competition_id,top_round 4,5 18596 Using intersect(competition_id,top_round); Using where; Using index; Using temporary; Using filesort
1 SIMPLE submission_submissions eq_ref PRIMARY PRIMARY 4 inkstakes.competition_submissions.submission_id 1 Using where
1 SIMPLE submission_ratings ALL submission_id 5 Using where; Using join buffer (flat, BNL join)
答案 0 :(得分:1)
假设实际上你不会对未评级的提交感兴趣,并且给定的提交只有给定匹配和top_round的单个competition_submissions
条目,我建议:
SELECT s.*
FROM (SELECT `submission_id`,
AVG(`stars`) AvgStars,
COUNT(`id`) CountId
FROM `submission_ratings`
GROUP BY `submission_id`
ORDER BY AVG(`stars`) DESC, COUNT(`id`) DESC
LIMIT 50) r
JOIN `submission_submissions` s
ON r.`submission_id` = s.`id` AND
s.`date_deleted` IS NULL
JOIN `competition_submissions` c
ON c.`submission_id` = s.`id` AND
c.`top_round` = 1 AND
c.`competition_id` = '2'
ORDER BY r.AvgStars DESC,
r.CountId DESC
(如果给定匹配和top_round的每次提交有多个competition_submissions
条目,则可以将GROUP BY子句添加回主查询。)
如果您确实想要查看未评级的提交,可以将此查询的结果合并到LEFT JOIN ... WHERE NULL查询。
答案 1 :(得分:1)
有一个简单的技巧适用于MySql,有助于避免在这样的查询中复制/排序巨大的临时表(使用LIMIT X)。
只需避免SELECT *
,这会将所有列复制到临时表中,然后对这个巨大的表进行排序,最后,查询只从这个巨大的表中获取50条记录(50/70000 = 0,07%)。
仅选择执行排序和限制所必需的列,然后仅按ID为选定的50条记录添加缺失列。
select ss.*
from submission_submissions ss
join (
SELECT `submission_submissions`.id,
AVG(submission_ratings.`stars`) stars,
COUNT(submission_ratings.`id`) cnt
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
) xx
ON ss.id = xx.id
ORDER BY xx.stars DESC,
xx.cnt DESC;