GROUP BY + HAVING ignores a row

Time: 2017-12-12 21:06:20

Tags: mysql

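Basically, I want to select all race records together with the record holder and the best time. I looked up similar queries and found three that were faster than the others.

The problem is that one of them completely ignores the race where user ID 2 holds the record.

These are my tables, indexes, and some sample data: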

CREATE TABLE `races` (
 `raceid` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
 `name` varchar(20) NOT NULL,
 PRIMARY KEY (`raceid`),
 UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `users` (
 `userid` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
 `name` varchar(20) NOT NULL,
 PRIMARY KEY (`userid`),
 UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `race_times` (
 `raceid` smallint(5) unsigned NOT NULL,
 `userid` mediumint(8) unsigned NOT NULL,
 `time` mediumint(8) unsigned NOT NULL,
 PRIMARY KEY (`raceid`,`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `races` (`raceid`, `name`) VALUES
(1, 'Doherty'),
(3, 'Easter Basin Naval S'),
(5, 'Flint County'),
(6, 'Fort Carson'),
(4, 'Glen Park'),
(2, 'Palomino Creek'),
(7, 'Tierra Robada');

INSERT INTO `users` (`userid`, `name`) VALUES
(1, 'Player 1'),
(2, 'Player 2');

INSERT INTO `race_times` (`raceid`, `userid`, `time`) VALUES
(1, 1, 51637),
(1, 2, 50000),
(2, 1, 148039),
(3, 1, 120516),
(3, 2, 124773),
(4, 1, 101109),
(6, 1, 89092),
(6, 2, 89557),
(7, 1, 77933),
(7, 2, 78038);

So if I run either of these two queries:

SELECT rt1.raceid, r.name, rt1.userid, p.name, rt1.time
FROM race_times rt1
LEFT JOIN users p ON (rt1.userid = p.userid)
JOIN races r ON (r.raceid = rt1.raceid)
WHERE rt1.time = (SELECT MIN(rt2.time) FROM race_times rt2 WHERE rt1.raceid = rt2.raceid)
GROUP BY r.name;

or..

SELECT rt1.*, r.name, p.name
FROM race_times rt1
LEFT JOIN users p ON p.userid = rt1.userid
JOIN races r ON r.raceid = rt1.raceid
WHERE EXISTS (SELECT NULL FROM race_times rt2 WHERE rt2.raceid = rt1.raceid
GROUP BY rt2.raceid HAVING MIN(rt2.time) >= rt1.time);

I get the correct result, as shown below:

raceid | name                 | userid | name     | time   |
-------+----------------------+--------+----------+--------|
1      | Doherty              | 2      | Player 2 | 50000  |
3      | Easter Basin Naval S | 1      | Player 1 | 120516 |
6      | Fort Carson          | 1      | Player 1 | 89092  |
4      | Glen Park            | 1      | Player 1 | 101109 |
2      | Palomino Creek       | 1      | Player 1 | 148039 |
7      | Tierra Robada        | 1      | Player 1 | 77933  |

This is the faulty query:

SELECT rt.raceid, r.name, rt.userid, p.name, rt.time
FROM race_times rt
LEFT JOIN users p ON p.userid = rt.userid
JOIN races r ON r.raceid = rt.raceid
GROUP BY r.name
HAVING rt.time = MIN(rt.time);

The result is as follows:

raceid | name                 | userid | name     | time   |
-------+----------------------+--------+----------+--------|
3      | Easter Basin Naval S | 1      | Player 1 | 120516 |
6      | Fort Carson          | 1      | Player 1 | 89092  |
4      | Glen Park            | 1      | Player 1 | 101109 |
2      | Palomino Creek       | 1      | Player 1 | 148039 |
7      | Tierra Robada        | 1      | Player 1 | 77933  |

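As you can see, the race "Doherty" (raceid: 1) belongs to "Player 2" (userid: 2), and it is not shown with the rest of the race records (which are all held by userid 1). What is the problem?

Regards

1 Answer:

Answer 0: (score: 0)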

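HAVING is a post filter. The query first collects all the results, and HAVING then filters them further. GROUP BY collapses the rows of each group into a single row, and for columns that are neither grouped nor aggregated MySQL simply picks a value from one row of the group, typically the first one it reads. Since Player 1's entry is the first entry for the first race, that is the row HAVING ends up working with. It is then filtered out because its time (51637) does not equal the group's MIN(time) (50000).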
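To see what HAVING is actually comparing, it can help to put the picked value and the aggregate side by side. The following is only a sketch: it assumes MySQL 5.7.5 or later (where ANY_VALUE() exists), and the aliases picked_time and best_time are names made up for this illustration. Which row ANY_VALUE() picks is not guaranteed, so treat it as a stand-in for the value the faulty query carries into HAVING, not as an exact reproduction.

```sql
-- Assumption: MySQL 5.7.5+ (ANY_VALUE() is not available in older versions).
-- picked_time: one arbitrary rt.time value from each group, standing in for
--              the bare rt.time that the faulty query hands to HAVING.
-- best_time:   the aggregate that HAVING compares it against.
SELECT r.name,
       ANY_VALUE(rt.time) AS picked_time,
       MIN(rt.time)       AS best_time
FROM race_times rt
JOIN races r ON r.raceid = rt.raceid
GROUP BY r.name;
```

Any group where picked_time and best_time differ is a group that `HAVING rt.time = MIN(rt.time)` would drop. With the sample data that is likely to be "Doherty", where Player 1's 51637 can be picked while the minimum is 50000.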

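That is why the other queries you posted use a subquery. My personal preference is the first example; to me it is easier to read. Performance-wise, they should be about the same.

Trying to avoid subqueries in the WHERE clause is not a bad idea, but that advice mostly applies when you can get the same result with a JOIN. Other times the result cannot be obtained with a JOIN, and a subquery is required.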
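For this particular problem a JOIN version does seem possible. A minimal sketch, assuming the tables from the question: the derived table alias best and the column aliases best_time, race_name and user_name are hypothetical names chosen for this example, and the query has not been benchmarked against the other two.

```sql
-- Compute the best time per race once, then join back to race_times
-- to find out which row (and therefore which user) holds that time.
SELECT rt.raceid,
       r.name  AS race_name,
       rt.userid,
       p.name  AS user_name,
       rt.time
FROM race_times rt
JOIN (
    -- One row per race: the race id and its minimum time.
    SELECT t2.raceid, MIN(t2.time) AS best_time
    FROM race_times t2
    GROUP BY t2.raceid
) best ON best.raceid = rt.raceid
      AND best.best_time = rt.time
JOIN races r ON r.raceid = rt.raceid
LEFT JOIN users p ON p.userid = rt.userid
ORDER BY r.name;
```

With the sample data this should return the same six rows as the correct result in the question, including "Doherty" for Player 2. Note that if two users tie for the best time in a race, both rows are returned, since nothing here breaks the tie.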