Sorting the results of a MySQL join by the AVG of a third table?

Time: 2013-08-11 10:22:43

Tags: mysql sorting join

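I have three tables.

One table contains the submissions and has about 75,000 rows.
One table contains the submission ratings and has fewer than 10 rows.
One table contains the submission => competition mappings and, with my test data, also has about 75,000 rows.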

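What I want to do is:

Get the top 50 submissions in a round of a competition. "Top" means the highest average rating, followed by the highest number of votes.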

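Here is the query I am using. It works, but it takes 45 seconds to complete! I profiled the query (results at the bottom), and the bottleneck is copying the data to a tmp table and then sorting it. How can I speed this up?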

 SELECT `submission_submissions`.* 
   FROM `submission_submissions`
   JOIN `competition_submissions` 
     ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings` 
     ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
  WHERE `top_round` =  1 
    AND `competition_id` =  '2'
    AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC, 
         COUNT(submission_ratings.`id`) DESC
  LIMIT 50

submission_submissions

CREATE TABLE `submission_submissions` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) NOT NULL,
  `title` varchar(255) NOT NULL,
  `description` varchar(255) DEFAULT NULL,
  `genre` int(11) NOT NULL,
  `goals` text,
  `submission` text NOT NULL,
  `date_created` datetime DEFAULT NULL,
  `date_modified` datetime DEFAULT NULL,
  `date_deleted` datetime DEFAULT NULL,
  `cover_image` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `genre` (`genre`),
  KEY `account_id` (`account_id`),
  KEY `date_created` (`date_created`)
) ENGINE=InnoDB AUTO_INCREMENT=115037 DEFAULT CHARSET=latin1;

submission_ratings

CREATE TABLE `submission_ratings` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) NOT NULL,
  `submission_id` int(11) NOT NULL,
  `stars` tinyint(1) NOT NULL,
  `date_created` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `submission_id` (`submission_id`),
  KEY `account_id` (`account_id`),
  KEY `stars` (`stars`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;

competition_submissions

CREATE TABLE `competition_submissions` (
  `competition_id` int(11) NOT NULL,
  `submission_id` int(11) NOT NULL,
  `top_round` int(11) DEFAULT '1',
  PRIMARY KEY (`submission_id`),
  KEY `competition_id` (`competition_id`),
  KEY `top_round` (`top_round`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

SHOW PROFILE results (ordered by duration)

state                 duration (summed) in sec percentage
Copying to tmp table  33.15621                 68.46924
Sorting result        11.83148                 24.43260
removing tmp table     3.06054                  6.32017
Sending data           0.37560                  0.77563
... insignificant amounts removed ...
Total                  48.42497               100.00000

EXPLAIN

id  select_type  table                    type         possible_keys                     key                       key_len  ref                                              rows   Extra                                                                                                 
1   SIMPLE       competition_submissions  index_merge  PRIMARY,competition_id,top_round  competition_id,top_round  4,5                                                       18596  Using intersect(competition_id,top_round); Using where; Using index; Using temporary; Using filesort  
1   SIMPLE       submission_submissions   eq_ref       PRIMARY                           PRIMARY                   4        inkstakes.competition_submissions.submission_id  1      Using where                                                                                           
1   SIMPLE       submission_ratings       ALL          submission_id                                                                                                         5      Using where; Using join buffer (flat, BNL join)                                                       

2 Answers:

Answer 0 (score: 1)

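Assuming that in practice you are not interested in unrated submissions, and that a given submission has only a single competition_submissions entry for a given match and top_round, I suggest: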

SELECT s.* 
FROM (SELECT `submission_id`, 
             AVG(`stars`) AvgStars, 
             COUNT(`id`) CountId
      FROM `submission_ratings` 
      GROUP BY `submission_id`
      ORDER BY AVG(`stars`) DESC, COUNT(`id`) DESC
      LIMIT 50) r
JOIN `submission_submissions` s
  ON r.`submission_id` = s.`id` AND
     s.`date_deleted` IS NULL
JOIN `competition_submissions` c
  ON c.`submission_id` = s.`id` AND 
     c.`top_round` =  1 AND
     c.`competition_id` = '2'
ORDER BY r.AvgStars DESC, 
         r.CountId DESC

(If there are multiple competition_submissions entries per submission for a given match and top_round, you can add the GROUP BY clause back to the main query.)
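
As a rough sketch of what adding the GROUP BY back could look like: the statement below is the query above with a grouping step on the outer level, so that duplicate competition_submissions rows collapse into one row per submission. Listing r.AvgStars and r.CountId in the GROUP BY is an addition made here so the statement also runs under ONLY_FULL_GROUP_BY; it is not part of the original suggestion.

```sql
SELECT s.*
FROM (SELECT `submission_id`,
             AVG(`stars`) AS AvgStars,
             COUNT(`id`)  AS CountId
      FROM `submission_ratings`
      GROUP BY `submission_id`
      ORDER BY AvgStars DESC, CountId DESC
      LIMIT 50) r
JOIN `submission_submissions` s
  ON r.`submission_id` = s.`id` AND
     s.`date_deleted` IS NULL
JOIN `competition_submissions` c
  ON c.`submission_id` = s.`id` AND
     c.`top_round` = 1 AND
     c.`competition_id` = 2
-- Collapse duplicate mapping rows to one row per submission.
GROUP BY s.`id`, r.AvgStars, r.CountId
ORDER BY r.AvgStars DESC,
         r.CountId DESC;
```

With the schema shown in the question, submission_id is the primary key of competition_submissions, so duplicates cannot occur there and the extra GROUP BY would only matter if that key were changed.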

If you do want to see unrated submissions, you can combine the results of this query with those of a LEFT JOIN ... WHERE NULL query.
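
A minimal sketch of such a LEFT JOIN ... WHERE NULL query, assuming "unrated" means that a submission has no row at all in submission_ratings. The table aliases are chosen here for illustration only.

```sql
SELECT s.*
FROM `submission_submissions` s
JOIN `competition_submissions` c
  ON c.`submission_id` = s.`id` AND
     c.`top_round` = 1 AND
     c.`competition_id` = 2
LEFT JOIN `submission_ratings` r
  ON r.`submission_id` = s.`id`
WHERE s.`date_deleted` IS NULL
  -- r.id is NOT NULL in the table, so NULL here means no rating row matched.
  AND r.`id` IS NULL
LIMIT 50;
```

Because unrated submissions rank below every rated one, these rows are only needed when the rated query returns fewer than 50 rows; they could then be appended after the rated results, for example in application code.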

Answer 1 (score: 1)

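There is a simple trick that works in MySQL and helps avoid copying and sorting huge temporary tables in queries like this one (queries with LIMIT X).

Simply avoid SELECT *. It causes all columns to be copied into the temporary table, this huge table is then sorted, and in the end the query takes only 50 records from it (50/70000 = 0.07%).

Select only the columns that are needed for sorting and limiting, and then add the missing columns by id for just the 50 selected records.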

select ss.*
from submission_submissions ss
join (
            SELECT `submission_submissions`.id, 
                    AVG(submission_ratings.`stars`) stars,
                    COUNT(submission_ratings.`id`) cnt
               FROM `submission_submissions`
               JOIN `competition_submissions` 
                 ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
            LEFT JOIN `submission_ratings` 
                 ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
              WHERE `top_round` =  1 
                AND `competition_id` =  '2'
                AND `submission_submissions`.`date_deleted` IS NULL
            GROUP BY submission_submissions.id
            ORDER BY AVG(submission_ratings.`stars`) DESC, 
                     COUNT(submission_ratings.`id`) DESC
              LIMIT 50
) xx
ON ss.id = xx.id
ORDER BY xx.stars DESC, 
         xx.cnt DESC;
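
To check whether a rewrite like this actually shrinks the "Copying to tmp table" and "Sorting result" stages, the same SHOW PROFILE measurement used in the question can be repeated. The sketch below profiles only the narrow inner query; the aliases are illustrative, and the query id 1 assumes it is the first statement profiled in the session (SHOW PROFILES lists the real ids).

```sql
-- Enable per-session profiling (deprecated in newer MySQL versions, but still available).
SET profiling = 1;

-- The narrow query: only the id and the two sort keys go through the temporary table.
SELECT ss.id,
       AVG(sr.stars) AS avg_stars,
       COUNT(sr.id)  AS cnt
FROM submission_submissions ss
JOIN competition_submissions cs
  ON cs.submission_id = ss.id
LEFT JOIN submission_ratings sr
  ON sr.submission_id = ss.id
WHERE cs.top_round = 1
  AND cs.competition_id = 2
  AND ss.date_deleted IS NULL
GROUP BY ss.id
ORDER BY avg_stars DESC, cnt DESC
LIMIT 50;

-- List the profiled statements with their ids and total durations.
SHOW PROFILES;

-- Per-stage breakdown for one statement; replace 1 with the id shown by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;
```

Comparing this breakdown with the one for the original SELECT * query shows how much of the time was spent moving wide rows through the temporary table.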