SQL aggregation on one column giving the result from another column

Date: 2019-02-13 20:15:15

Tags: sql sqlite tsql

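I am trying (and failing) to join some tables in a SQLite database. The real data is complicated, but I have reduced it to a simplified example.

These are the three tables I want to join.

Table: Events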

+----+---------+-------+-----------+
| id | user_id | class | timestamp |
+----+---------+-------+-----------+
|  1 | 'user1' |     6 |       100 |
|  2 | 'user1' |    12 |       400 |
|  3 | 'user1' |     4 |       900 |
|  4 | 'user2' |     6 |       400 |
|  5 | 'user2' |     3 |       800 |
|  6 | 'user2' |     8 |       900 |
+----+---------+-------+-----------+

Table: Games

+---------+---------+------------+-----------+
| user_id | game_id | game_class | timestamp |
+---------+---------+------------+-----------+
| 'user1' |       1 | 'A'        |       200 |
| 'user2' |       2 | 'A'        |       300 |
| 'user1' |       3 | 'B'        |       500 |
| 'user1' |       4 | 'A'        |       600 |
| 'user1' |       5 | 'A'        |       700 |
+---------+---------+------------+-----------+

Table: AScores

+---------+-------+
| game_id | score |
+---------+-------+
|       1 |     8 |
|       2 |     2 |
|       4 |     9 |
|       5 |     6 |
+---------+-------+

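I want to join these so that I get the first table with one extra column, containing the user's current score in game class A at the time the event happened. That is, I want the result of the join to look like this:

Desired result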

+----+----------+-------+-----------+-----------------+
| id | user_id  | class | timestamp | current_a_score |
+----+----------+-------+-----------+-----------------+
|  1 |  'user1' |     6 |       100 | (null)          |
|  2 |  'user1' |    12 |       400 | 8               |
|  3 |  'user1' |     4 |       900 | 6               |
|  4 |  'user2' |     6 |       400 | 2               |
|  5 |  'user2' |     3 |       800 | 2               |
|  6 |  'user2' |     8 |       900 | 2               |
+----+----------+-------+-----------+-----------------+

The following simple join combines the two tables AScores and Games.

SELECT * FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id

So I hoped to join that, as a subquery, to the Events table. Like this:

SELECT Events.*, AScoredGames.time_stamp AS game_time_stamp, AScoredGames.score
FROM Events
LEFT OUTER JOIN (
    SELECT AScores.score, Games.* FROM AScores
    INNER JOIN Games
    ON AScores.game_id = Games.game_id
) AS AScoredGames
ON Events.user_id = AScoredGames.user_id 
AND Events.time_stamp >= AScoredGames.time_stamp
ORDER BY Events.time_stamp ASC

The result is as follows:

+----+---------+-------+------------+-----------------+-------+
| id | user_id | class | time_stamp | game_time_stamp | score |
+----+---------+-------+------------+-----------------+-------+
|  1 | user1   |     6 | 100        | NULL            | NULL  |
|  2 | user1   |    12 | 400        | 200             | 8     |
|  4 | user2   |     6 | 400        | 300             | 2     |
|  5 | user2   |     3 | 800        | 300             | 2     |
|  6 | user2   |     8 | 900        | 300             | 2     |
|  3 | user1   |     4 | 900        | 200             | 8     |
|  3 | user1   |     4 | 900        | 600             | 9     |
|  3 | user1   |     4 | 900        | 700             | 6     |
+----+---------+-------+------------+-----------------+-------+

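So I need to group by Events.id to get rid of the triplicated rows for Events.id 3. But what I want is to select the row with the largest game_time_stamp and then use the score from that row. If I use MAX(game_time_stamp) as an aggregate, I still have to aggregate score independently. Is there a way to tie the row chosen by the aggregate function on the score column to the result of the aggregate function on the game_time_stamp column?

(NB Existing answers to questions such as "Select first record in a One-to-Many relation using left join" and "SQL Server: How to Join to first row" seem to suggest that I can't, and that I must use a WHERE clause on a subquery instead. I am struggling with that approach (I will post a separate question about it). I can think of at least one workaround, but I hope there is a better solution.)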

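2 answers:

Answer 0: (score: 1)

The following query should do it. It uses a NOT EXISTS condition with a correlated subquery to find the relevant game record for each event.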

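SELECT e.*, s.score AS current_a_score
FROM
    events e
    -- candidate games: same user, played before the event
    LEFT JOIN games g
        ON  g.user_id = e.user_id
        AND g.timestamp < e.timestamp
        -- keep only the latest one: no later game g1 of that user precedes the event
        AND NOT EXISTS (
            SELECT 1
            FROM games g1
            WHERE
                g1.user_id = e.user_id
                AND g1.timestamp < e.timestamp
                AND g1.timestamp > g.timestamp
        )
    -- look up the score of that game (NULL if it has no AScores row)
    LEFT JOIN ascores s
        ON  s.game_id = g.game_id
ORDER BY e.id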

With your test data, this DB Fiddle demo returns:

| id  | user_id | class | timestamp | current_a_score |
| --- | ------- | ----- | --------- | --------------- |
| 1   | user1   | 6     | 100       |                 |
| 2   | user1   | 12    | 400       | 8               |
| 3   | user1   | 4     | 900       | 6               |
| 4   | user2   | 6     | 400       | 2               |
| 5   | user2   | 3     | 800       | 2               |
| 6   | user2   | 8     | 900       | 2               |
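
Note that the query above picks the latest earlier game of any class and only then looks up its score, so an event that directly follows a class B game (for example, a hypothetical user1 event at timestamp 550) would get NULL rather than 8. One possible variant, as a minimal sketch for SQLite: a correlated scalar subquery that joins games to ascores first, so only scored games are considered. The table definitions and column types below are assumptions made to keep the example self-contained, and it uses <= to match the >= comparison in the question's own join.

```sql
-- Assumed schema; the data is copied from the question.
CREATE TABLE events (id INTEGER PRIMARY KEY, user_id TEXT, class INTEGER, timestamp INTEGER);
CREATE TABLE games (user_id TEXT, game_id INTEGER PRIMARY KEY, game_class TEXT, timestamp INTEGER);
CREATE TABLE ascores (game_id INTEGER PRIMARY KEY, score INTEGER);

INSERT INTO events VALUES
    (1, 'user1', 6, 100), (2, 'user1', 12, 400), (3, 'user1', 4, 900),
    (4, 'user2', 6, 400), (5, 'user2', 3, 800),  (6, 'user2', 8, 900);
INSERT INTO games VALUES
    ('user1', 1, 'A', 200), ('user2', 2, 'A', 300), ('user1', 3, 'B', 500),
    ('user1', 4, 'A', 600), ('user1', 5, 'A', 700);
INSERT INTO ascores VALUES (1, 8), (2, 2), (4, 9), (5, 6);

SELECT e.*,
       -- latest scored game of the same user at or before the event
       (SELECT s.score
        FROM games g
        JOIN ascores s ON s.game_id = g.game_id
        WHERE g.user_id = e.user_id
          AND g.timestamp <= e.timestamp
        ORDER BY g.timestamp DESC
        LIMIT 1) AS current_a_score
FROM events e
ORDER BY e.id;
```

With the sample data this should give the same six rows as the desired result in the question. If two scored games of one user share a timestamp, which of them wins is not defined unless a tie-breaker is added to the ORDER BY.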

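Answer 1: (score: 0)

I have a workaround, but it feels fragile and depends on the specifics of my data. First note that the time_stamps are all multiples of 100 and the scores are all below 10. I can combine the two in a way that does not interfere with my comparison, but it means both values are encoded in a single numeric column. This query gives the desired result: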

SELECT Events.id,
       MIN(Events.user_id) AS user_id,
       MIN(Events.class) AS class,
       MIN(Events.time_stamp) AS time_stamp,
       MAX(AScoredGames.combination) % 10 AS current_a_score
FROM Events
LEFT OUTER JOIN (
    SELECT AScores.score,
           AScores.score + (Games.time_stamp - 10) AS combination,
           Games.*
    FROM AScores
    INNER JOIN Games
    ON AScores.game_id = Games.game_id
) AS AScoredGames
ON Events.user_id = AScoredGames.user_id
AND Events.time_stamp >= AScoredGames.time_stamp
GROUP BY Events.id
ORDER BY id ASC

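(The combination is done in AScores.score + (Games.time_stamp - 10), so the aggregate function becomes MAX(AScoredGames.combination) % 10. Because every time_stamp minus 10 is still a multiple of 10 and every score is below 10, the largest combination belongs to the latest game, and its last digit is that game's score.)

Actual result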

+----+---------+-------+------------+-----------------+
| id | user_id | class | time_stamp | current_a_score |
+----+---------+-------+------------+-----------------+
|  1 | user1   |     6 |        100 | NULL            |
|  2 | user1   |    12 |        400 | 8               |
|  3 | user1   |     4 |        900 | 6               |
|  4 | user2   |     6 |        400 | 2               |
|  5 | user2   |     3 |        800 | 2               |
|  6 | user2   |     8 |        900 | 2               |
+----+---------+-------+------------+-----------------+
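
If the database is SQLite, the encoding may not be necessary. SQLite documents that when a query contains exactly one MIN() or MAX() aggregate, the bare (non-aggregated) columns in the result take their values from an input row that holds that minimum or maximum. The following is a sketch under that assumption only: it relies on SQLite-specific behaviour, other engines such as SQL Server reject bare non-grouped columns, and the table definitions and types are assumed from the question's sample data.

```sql
-- Assumed schema; the data is copied from the question.
CREATE TABLE Events (id INTEGER PRIMARY KEY, user_id TEXT, class INTEGER, time_stamp INTEGER);
CREATE TABLE Games (user_id TEXT, game_id INTEGER PRIMARY KEY, game_class TEXT, time_stamp INTEGER);
CREATE TABLE AScores (game_id INTEGER PRIMARY KEY, score INTEGER);

INSERT INTO Events VALUES
    (1, 'user1', 6, 100), (2, 'user1', 12, 400), (3, 'user1', 4, 900),
    (4, 'user2', 6, 400), (5, 'user2', 3, 800),  (6, 'user2', 8, 900);
INSERT INTO Games VALUES
    ('user1', 1, 'A', 200), ('user2', 2, 'A', 300), ('user1', 3, 'B', 500),
    ('user1', 4, 'A', 600), ('user1', 5, 'A', 700);
INSERT INTO AScores VALUES (1, 8), (2, 2), (4, 9), (5, 6);

SELECT Events.id, Events.user_id, Events.class, Events.time_stamp,
       -- the single MAX() decides which joined row supplies the bare columns
       MAX(AScoredGames.time_stamp) AS game_time_stamp,
       AScoredGames.score AS current_a_score
FROM Events
LEFT OUTER JOIN (
    SELECT AScores.score, Games.user_id, Games.time_stamp
    FROM AScores
    INNER JOIN Games ON AScores.game_id = Games.game_id
) AS AScoredGames
ON Events.user_id = AScoredGames.user_id
AND Events.time_stamp >= AScoredGames.time_stamp
GROUP BY Events.id
ORDER BY Events.id;
```

Unlike the combination trick, this does not depend on the time_stamps being multiples of 100 or the scores staying below 10. Adding a second MIN() or MAX() to the SELECT list would make the choice of row ambiguous, so the sketch keeps exactly one.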