MySQL MAX() with GROUP BY skips entries

Time: 2014-08-27 14:48:29

Tags: mysql sql greatest-n-per-group

I have some data in an SQL table named entries:

username    location    species date    length  weight  timestamp   id
    BooF    Black Lake  Smallmouth Bass 2014-08-12  12  1   2014-08-04 12:58:00 1
    BooF    Black Lake  Largemouth Bass 2014-08-13  15  2   2014-08-04 12:58:00 2
    BooF    Black Lake  Largemouth Bass 2014-08-19  20  5   2014-08-04 12:58:00 3
    BooF    Lake Bonaparte  Smallmouth Bass 2014-08-13  13  1   2014-08-04 12:58:00 4
    BooF    Lake Bonaparte  Largemouth Bass 2014-08-28  14  2   2014-08-04 12:58:00 5
    BooF    Black Lake  Largemouth Bass 2014-08-27  18  3   2014-08-04 13:22:03 6
    BooF    Lake Bonaparte  Smallmouth Bass 2014-08-19  14.3    3.4 2014-08-05 16:58:47 8
    BooF    Cranberry Lake  Walleye 2014-08-05  10  1   2014-08-18 17:14:00 10
    BooF    Cranberry Lake  Walleye 2014-08-05  10  1   2014-08-18 17:16:28 11
            Indian Lake Walleye 2014-08-05  10  1   2014-08-18 17:30:14 13
    BooF    Indian Lake Walleye 2014-08-05  10  1   2014-08-18 17:34:38 14
    BooF    Crystal Lake    Walleye 2014-08-06  10  4   2014-08-18 17:35:29 15
    BooF    Hudson River    Walleye 2014-08-11  10  2   2014-08-19 15:29:19 16
    BooF    Indian River    Northern Pike   2014-08-05  20  2   2014-08-26 09:46:03 17
            Hudson River    Smallmouth Bass 2014-08-05  12  1   2014-08-26 09:47:14 18
    BooF    Hyde Lake   Pickerel    2014-08-06  20  2   2014-08-26 09:48:24 20
--> BooF    Lake Ozonia Walleye 2014-08-14  20  3   2014-08-26 10:10:59 23
--> BooF    Mud Lake    Walleye 2014-08-14  21  2   2014-08-26 10:10:59 24
    Daswabbage  Lake Ontario    White Crappie   2014-08-12  15  20  2014-08-26 12:25:00 26
    Daswabbage  Lake Ontario    White Crappie   2014-08-06  16  21  2014-08-26 12:25:49 27
    Daswabbage  Butterfield Lake    Black Crappie   2014-08-13  5   2   2014-08-26 12:27:00 28
    Daswabbage  Black River Smallmouth Bass 2014-08-12  12  2   2014-08-26 12:28:09 29
    Daswabbage  Cranberry Lake  Smallmouth Bass 2014-08-20  5   5   2014-08-26 12:34:10 30
    Daswabbage  Clear Lake  Smallmouth Bass 2014-08-05  3   6   2014-08-26 12:41:52 31
    Daswabbage  Clear Lake  Smallmouth Bass 2014-08-06  10  7   2014-08-26 13:00:48 32
    BooF    Cranberry Lake  Pickerel    2014-08-07  15  5   2014-08-26 15:13:45 34
    BooF    Cranberry Lake  Pickerel    2014-08-02  13  6   2014-08-26 15:15:08 35
    BooF    Butterfield Lake    White Crappie   2014-08-18  10  26  2014-08-26 15:15:42 36
--> BooF    Lake Ozonia Walleye 2014-08-31  9   5   2014-08-26 15:17:18 37
--> BooF    Grass Lake  White Crappie   2014-08-11  15  30  2014-08-26 15:18:52 38
--> BooF    Grass Lake  White Crappie   2014-08-20  15  30  2014-08-26 16:06:44 39
--> BooF    Crystal Lake    White Crappie   2014-08-20  6   10  2014-08-26 16:59:32 43

I am trying to get the MAX length of each species for a specific user. I tried

"SELECT length.* FROM entries length 
    INNER JOIN (SELECT species, MAX(length) AS  MaxLength 
    FROM entries WHERE username = 'BooF' GROUP BY species) 
    groupedlength ON   
    length.species = groupedlength.species AND length.length = groupedlength.MaxLength 
    ORDER BY species"

I also tried

SELECT * FROM (SELECT * FROM entries ORDER BY length DESC) tmp 
    WHERE username='BooF' GROUP BY species

Both approaches seem to produce the same result:

username    location    species date    length  weight  timestamp   id
    BooF    Black Lake  Largemouth Bass 2014-08-19  20  5   2014-08-04 12:58:00 3
    BooF    Indian River    Northern Pike   2014-08-05  20  2   2014-08-26 09:46:03 17
    BooF    Hyde Lake   Pickerel    2014-08-06  20  2   2014-08-26 09:48:24 20
    BooF    Lake Bonaparte  Smallmouth Bass 2014-08-19  14.3    3.4 2014-08-05 16:58:47 8
    BooF    Lake Ozonia Walleye 2014-08-31  9   5   2014-08-26 15:17:18 37
    BooF    Crystal Lake    White Crappie   2014-08-20  6   10  2014-08-26 16:59:32 43

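If you can see through my mess, the results for Walleye and White Crappie are wrong. Lengths of 9 and 6 are clearly not the maximum lengths for those fish in my original data. I believe I am writing the queries correctly, but I don't understand why they skip numbers higher than the ones in the output. Thanks in advance.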

2 answers:

Answer 0: (score: 1)

If you only need the max length, that is a simple aggregation:

SELECT MAX(length), species FROM entries WHERE username = 'BooF' group by species

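If you also need the table's other columns from the row that holds the maximum, it gets trickier. Note that just adding the desired columns (or selecting * without aggregating them) will not give correct results, because those columns are not aggregated. (MS SQL will apparently throw an error; MySQL, when ONLY_FULL_GROUP_BY is not enabled, returns an indeterminate value for each column you do not aggregate.)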

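Assuming you want the other columns of the matching row, you can do the following:

  • Join the table to itself.
  • Add a < comparison to the join condition, so that each row on the left is paired only with rows that have a larger length.
  • Then select every row where the RIGHT side IS NULL: no match means there is no LARGER row, so the left row is the maximum.

For example: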

SELECT
  `left`.*
FROM
  entries `left`
LEFT JOIN
  entries `right`
ON
  `left`.species = `right`.species -- only compare the same species
  AND `left`.username = `right`.username -- only compare  for the same user
  AND `left`.length < `right`.length -- smaller result on the left side.
WHERE
  ISNULL(`right`.id) -- choose the one that has no larger match.
  AND `left`.username = 'BooF'; -- just for BooF.

P.S.: left and right are poor choices for table aliases, because they are reserved keywords (hence the backticks) :)

Answer 1: (score: 0)

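I can only reproduce your strange result by using a string data type for length, because the string "10" is lexically less than the string "9" or "6". So when two numbers have a different number of digits, the maximum string value is not necessarily the numerically largest one.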

mysql> SELECT 10 < 9;
+--------+
| 10 < 9 |
+--------+
|      0 |
+--------+

mysql> SELECT '10' < '9';
+------------+
| '10' < '9' |
+------------+
|          1 |
+------------+
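
The same effect shows up with MAX() itself. As a minimal sketch, the query below uses the distinct lengths of BooF's Walleye rows from the question, written as inline string literals instead of reading the entries table; the derived table t and the column v are names made up for this illustration.

```sql
-- Illustrative only: t and v are hypothetical names, not part of the entries table.
SELECT MAX(v)                        AS string_max,  -- values compared as strings
       MAX(CAST(v AS DECIMAL(9,1))) AS numeric_max  -- values compared as numbers
FROM (
    SELECT '10' AS v
    UNION ALL SELECT '20'
    UNION ALL SELECT '21'
    UNION ALL SELECT '9'
) AS t;
```

Here string_max comes back as '9', the value the question's queries reported for Walleye, while numeric_max is 21.0.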

You should store length as a DECIMAL(9,1) data type, not as a string.
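
A minimal sketch of that change, assuming length is currently a string column such as VARCHAR (the question does not show the table definition) and that lengths are never negative. Back up the table before trying it.

```sql
-- Step 1: list rows whose length is not a plain non-negative number.
-- These would need cleaning up before the column type is changed.
SELECT id, length
FROM entries
WHERE length NOT REGEXP '^[0-9]+([.][0-9]+)?$';

-- Step 2: convert the column in place.
-- MODIFY replaces the whole column definition, so restate NOT NULL or
-- DEFAULT here if the existing column has them.
ALTER TABLE entries MODIFY length DECIMAL(9,1);
```

Once the column is numeric, the original MAX(length) query compares numbers and no CAST is needed.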

Another solution is to cast the length values to a numeric type when computing MAX():

SELECT length.* FROM entries length 
INNER JOIN (SELECT species, MAX(CAST(length AS DECIMAL(9,1))) AS  MaxLength 
    FROM entries WHERE username = 'BooF' GROUP BY species) AS groupedlength 
ON length.species = groupedlength.species AND length.length = groupedlength.MaxLength 
WHERE length.username = 'BooF'
ORDER BY species

In my test, it correctly finds the entries matching the maximum:

+----+----------+----------------+-----------------+------------+--------+--------+---------------------+
| id | username | location       | species         | date       | length | weight | timestamp           |
+----+----------+----------------+-----------------+------------+--------+--------+---------------------+
|  3 | BooF     | Black Lake     | Largemouth Bass | 2014-08-19 | 20     | 5      | 2014-08-04 12:58:00 |
| 17 | BooF     | Indian River   | Northern Pike   | 2014-08-05 | 20     | 2      | 2014-08-26 09:46:03 |
| 20 | BooF     | Hyde Lake      | Pickerel        | 2014-08-06 | 20     | 2      | 2014-08-26 09:48:24 |
|  8 | BooF     | Lake Bonaparte | Smallmouth Bass | 2014-08-19 | 14.3   | 3.4    | 2014-08-05 16:58:47 |
| 24 | BooF     | Mud Lake       | Walleye         | 2014-08-14 | 21     | 2      | 2014-08-26 10:10:59 |
| 38 | BooF     | Grass Lake     | White Crappie   | 2014-08-11 | 15     | 30     | 2014-08-26 15:18:52 |
| 39 | BooF     | Grass Lake     | White Crappie   | 2014-08-20 | 15     | 30     | 2014-08-26 16:06:44 |
+----+----------+----------------+-----------------+------------+--------+--------+---------------------+

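However, you may notice that it finds every entry that matches the maximum. So White Crappie has two rows, because both are tied for the maximum. I don't know whether that is what you intend.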
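
If only one row per species is wanted, one option is a window function. This is a sketch, not the approach used above: it needs MySQL 8.0 or later, which postdates this question, and it assumes ties should go to the row with the lowest id. That tie-break rule, and the names ranked and rn, are assumptions made for illustration.

```sql
-- Requires MySQL 8.0+ (window functions).
-- Assumption: ties on length are broken by the lowest id.
SELECT id, username, location, species, `date`, length, weight, `timestamp`
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (
               PARTITION BY species
               ORDER BY CAST(length AS DECIMAL(9,1)) DESC, id
           ) AS rn
    FROM entries AS e
    WHERE username = 'BooF'
) AS ranked
WHERE rn = 1
ORDER BY species;
```

The CAST is only needed while length is stored as a string; with a DECIMAL column, ORDER BY length DESC, id is enough.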