MySQL subquery to select the first row of each group

Time: 2019-01-25 21:00:45

Tags: mysql greatest-n-per-group

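I need to create a MySQL stored procedure that selects every User together with the SUM of the Points they have earned.

The query should group the Game rows by StartDate and select only the first row of each group, ordered by Points. I am trying to ignore duplicate StartDate values for the same User while still keeping the first one. This should prevent cheating when a User saves the same game twice.

If a User is not in any Game, the query should still return that User, with NULL as the Value.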

CREATE PROCEDURE `spGetPoints`(
    IN _StartDate DATETIME,
    IN _EndDate DATETIME,
    IN _Limit INT,
    IN _Offset INT
)
BEGIN
    SELECT `User`.`UserId`, `User`.`Username`,
        (SELECT SUM(`Game`.`Points`)
            FROM `Game`
            WHERE `Game`.`UserId` = `User`.`UserId` AND
            `Game`.`StartDate` > _StartDate AND `Game`.`StartDate` < _EndDate
            GROUP BY `Game`.`StartDate`
            ORDER BY `Game`.`Points` DESC
            LIMIT 1
        ) AS `Value`
    FROM `User`
    ORDER BY `Value` DESC, `User`.`Username` ASC
    LIMIT _Limit OFFSET _Offset;
END

Sample User table

+--------+----------+
| UserId | Username |
+--------+----------+
|      1 | JaneDoe  |
|      2 | JohnDoe  |
+--------+----------+

Sample Game table

+--------+--------+-------------------------+--------+
| GameId | UserId |        StartDate        | Points |
+--------+--------+-------------------------+--------+
|      1 |      1 | 2019-01-09  12:43:00 AM |   1789 |
|      2 |      1 | 2019-01-09  11:35:00 AM |   1048 |
|      3 |      1 | 2019-01-09  9:22:00 AM  |    900 |
|      4 |      1 | 2019-01-09  12:43:00 AM |   1789 |
|      5 |      1 | 2019-01-09  11:35:00 AM |   1048 |
|      6 |      1 | 2019-01-09  9:22:00 AM  |    900 |
|      7 |      1 | 2019-01-09  12:43:00 AM |   1789 |
|      8 |      1 | 2019-01-09  11:35:00 AM |   1048 |
|      9 |      2 | 2019-01-17  12:05:00 AM |    552 |
|     10 |      2 | 2019-01-24  12:08:00 AM |    512 |
|     11 |      2 | 2019-01-27  5:13:00 PM  |      0 |
+--------+--------+-------------------------+--------+

Current result

+--------+----------+-------+
| UserId | Username | Value |
+--------+----------+-------+
|      1 | JaneDoe  |  5367 |
|      2 | JohnDoe  |   552 |
+--------+----------+-------+

Expected result

+--------+----------+-------+
| UserId | Username | Value |
+--------+----------+-------+
|      1 | JaneDoe  |  3737 |
|      2 | JohnDoe  |  1064 |
+--------+----------+-------+
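The gap between the two result tables can be checked by hand. The sketch below is plain Python rather than MySQL and is only meant to illustrate the arithmetic. It assumes the sample Game rows above, skips the date filter (every sample row falls inside the window used later in the question), and uses two hypothetical helper names, `deduplicated_totals` and `top_group_sum`, that do not appear in the original post.

```python
from collections import defaultdict

# Sample Game rows from the table above: (GameId, UserId, StartDate, Points).
# StartDate is kept as text; only equality between values matters here.
GAMES = [
    (1, 1, "2019-01-09 12:43:00 AM", 1789),
    (2, 1, "2019-01-09 11:35:00 AM", 1048),
    (3, 1, "2019-01-09 9:22:00 AM", 900),
    (4, 1, "2019-01-09 12:43:00 AM", 1789),
    (5, 1, "2019-01-09 11:35:00 AM", 1048),
    (6, 1, "2019-01-09 9:22:00 AM", 900),
    (7, 1, "2019-01-09 12:43:00 AM", 1789),
    (8, 1, "2019-01-09 11:35:00 AM", 1048),
    (9, 2, "2019-01-17 12:05:00 AM", 552),
    (10, 2, "2019-01-24 12:08:00 AM", 512),
    (11, 2, "2019-01-27 5:13:00 PM", 0),
]


def deduplicated_totals(games):
    """Sum Points per user, counting each (UserId, StartDate) pair once."""
    best = {}  # (UserId, StartDate) -> highest Points seen for that pair
    for _game_id, user_id, start_date, points in games:
        key = (user_id, start_date)
        best[key] = max(points, best.get(key, points))
    totals = defaultdict(int)
    for (user_id, _start_date), points in best.items():
        totals[user_id] += points
    return dict(totals)


def top_group_sum(games, user_id):
    """Mimic the current query: sum each StartDate group, keep only the
    group whose Points value is highest."""
    groups = defaultdict(list)  # StartDate -> list of Points in that group
    for _game_id, uid, start_date, points in games:
        if uid == user_id:
            groups[start_date].append(points)
    if not groups:
        return None
    top = max(groups.values(), key=max)
    return sum(top)


print(deduplicated_totals(GAMES))  # expected-result logic
print(top_group_sum(GAMES, 1), top_group_sum(GAMES, 2))  # current-result logic
```

Under these assumptions, the deduplicated totals match the expected table (1789 + 1048 + 900 = 3737 and 552 + 512 + 0 = 1064), while the current result for JaneDoe is the sum of a single StartDate group that contains three copies of the same game (3 × 1789 = 5367).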

By selecting the SUM from a subquery and hard-coding the UserId, I can get the expected result with the following statement.

SELECT SUM(`x`.`Points`) FROM
(SELECT `Points`
    FROM `Game`
    WHERE `Game`.`UserId` = 1 AND
    `Game`.`StartDate` > STR_TO_DATE('01/09/2019', '%m/%d/%Y') AND `Game`.`StartDate` < STR_TO_DATE('02/09/2019', '%m/%d/%Y')
    GROUP BY `Game`.`StartDate`
    ORDER BY `Game`.`Points` ASC) AS `x`;

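When I try to put that statement into a subquery of the statement below, I get this error message: Error Code: 1054. Unknown column 'User.UserId' in 'where clause'. I get this error because the outer `User`.`UserId` is not visible inside the second, nested subquery.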

SELECT `User`.`UserId`, `User`.`Username`,
        (SELECT SUM(`x`.`Points`) FROM (SELECT `Game`.`Points`
            FROM `Game`
            WHERE `Game`.`UserId` = `User`.`UserId` AND
            `Game`.`StartDate` > STR_TO_DATE('01/09/2019', '%m/%d/%Y') AND `Game`.`StartDate` < STR_TO_DATE('02/09/2019', '%m/%d/%Y')
            GROUP BY `Game`.`StartDate`
            ORDER BY `Game`.`Points` DESC) AS `x`
        ) AS `Value`
    FROM `User`
    ORDER BY `Value` DESC, `User`.`Username` ASC;

1 Answer:

Answer 0 (score: 0)

I changed the query to use a LEFT JOIN on Game. I also added GROUP BY `Game`.`UserId`, `Game`.`StartDate` inside the joined subquery and GROUP BY `User`.`UserId` on the outer query.

CREATE PROCEDURE `spGetPoints`(
    IN _StartDate DATETIME,
    IN _EndDate DATETIME,
    IN _Limit INT,
    IN _Offset INT
)
BEGIN
    SELECT `User`.`UserId`, `User`.`Username`,
        SUM(`Game`.`Points`) AS `Value`
        FROM `User`
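        -- Derived table: one row per (UserId, StartDate), so duplicate saves count once.
        -- Note: SELECT * with this GROUP BY is only accepted when ONLY_FULL_GROUP_BY
        -- is disabled, and MySQL does not guarantee which row it keeps per group;
        -- the ORDER BY inside the derived table does not control that choice.
        LEFT JOIN (SELECT *
            FROM `Game`
            WHERE `Game`.`StartDate` > _StartDate AND `Game`.`StartDate` < _EndDate
            GROUP BY `Game`.`UserId`, `Game`.`StartDate`
            ORDER BY `Game`.`Points`
        ) AS `Game` ON `User`.`UserId` = `Game`.`UserId`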
        GROUP BY `User`.`UserId`
        ORDER BY `Value` DESC, `User`.`Username` ASC
        LIMIT _Limit OFFSET _Offset;
END
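
The LEFT JOIN idea can be tried outside MySQL. The following is a minimal sketch, not the procedure itself: it uses Python's sqlite3 module in place of MySQL, stores StartDate as 24-hour text, and swaps the `SELECT *` in the derived table for `MAX(Points)` per (UserId, StartDate) so that the query does not depend on non-aggregated columns. The third user `NoGames` and the helper names `build_sample_db` and `get_points` are hypothetical additions for illustration.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE User (UserId INTEGER PRIMARY KEY, Username TEXT);
        CREATE TABLE Game (GameId INTEGER PRIMARY KEY, UserId INTEGER,
                           StartDate TEXT, Points INTEGER);
        -- 'NoGames' is a hypothetical user without any Game rows.
        INSERT INTO User VALUES (1, 'JaneDoe'), (2, 'JohnDoe'), (3, 'NoGames');
        INSERT INTO Game VALUES
            (1, 1, '2019-01-09 00:43:00', 1789),
            (2, 1, '2019-01-09 11:35:00', 1048),
            (3, 1, '2019-01-09 09:22:00', 900),
            (4, 1, '2019-01-09 00:43:00', 1789),
            (5, 1, '2019-01-09 11:35:00', 1048),
            (6, 1, '2019-01-09 09:22:00', 900),
            (7, 1, '2019-01-09 00:43:00', 1789),
            (8, 1, '2019-01-09 11:35:00', 1048),
            (9, 2, '2019-01-17 00:05:00', 552),
            (10, 2, '2019-01-24 00:08:00', 512),
            (11, 2, '2019-01-27 17:13:00', 0);
    """)
    return conn


def get_points(conn, start_date, end_date, limit, offset):
    """Sum Points per user, counting each (UserId, StartDate) pair once."""
    query = """
        SELECT u.UserId, u.Username, SUM(g.Points) AS Value
        FROM User AS u
        LEFT JOIN (
            -- One row per (UserId, StartDate): duplicate saves collapse here.
            SELECT UserId, StartDate, MAX(Points) AS Points
            FROM Game
            WHERE StartDate > ? AND StartDate < ?
            GROUP BY UserId, StartDate
        ) AS g ON g.UserId = u.UserId
        GROUP BY u.UserId, u.Username
        ORDER BY Value DESC, u.Username ASC
        LIMIT ? OFFSET ?
    """
    return conn.execute(query, (start_date, end_date, limit, offset)).fetchall()


conn = build_sample_db()
for row in get_points(conn, "2019-01-09 00:00:00", "2019-02-09 00:00:00", 10, 0):
    print(row)
```

Because the derived table is joined rather than nested inside a correlated subquery, it never has to refer to the outer `User`.`UserId`, and the LEFT JOIN keeps users without games in the result with a NULL total (shown as `None` in Python).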

This link was also helpful: Select first row in each GROUP BY group?