Return unique grouped rows with the latest timestamp

Time: 2019-08-12 15:16:04

Tags: postgresql subquery distinct having

At the moment I am struggling with a problem that looks very simple.

Table contents:

Primary keys: Timestamp, COL_A, COL_B, COL_C, COL_D

+------------------+-------+-------+-------+-------+--------+--------+
|    Timestamp     | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:12 |     - | -     |     - |     - |      1 |      2 |
| 31.07.2019 15:32 |     1 | 1     |   100 |     1 |   5000 |     20 |
| 10.08.2019 09:33 |     - | -     |     - |     - |   1000 |      7 |
| 31.07.2019 15:38 |     1 | 1     |   100 |     1 |     33 |      5 |
| 06.08.2019 08:53 |     - | -     |     - |     - |      0 |      7 |
| 06.08.2019 09:08 |     - | -     |     - |     - |      0 |      7 |
| 06.08.2019 16:06 |     3 | 3     |     3 |     3 |      0 |     23 |
| 07.08.2019 10:43 |     - | -     |     - |     - |      0 |     42 |
| 07.08.2019 13:10 |     - | -     |     - |     - |      0 |     24 |
| 08.08.2019 07:19 |    11 | 111   |   111 |    12 |      0 |      2 |
| 08.08.2019 10:54 |  2334 | 65464 |   565 |    76 |   1000 |     19 |
| 08.08.2019 11:15 |   232 | 343   |   343 |    43 |      0 |      2 |
| 08.08.2019 11:30 |  2323 | rtttt |  3434 |    34 |      0 |      2 |
| 10.08.2019 14:47 |     - | -     |     - |     - |    123 |     23 |
+------------------+-------+-------+-------+-------+--------+--------+

Desired query output:

+------------------+-------+-------+-------+-------+--------+--------+
|    Timestamp     | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:38 |     1 | 1     |   100 |     1 |     33 |      5 |
| 06.08.2019 16:06 |     3 | 3     |     3 |     3 |      0 |     23 |
| 08.08.2019 07:19 |    11 | 111   |   111 |    12 |      0 |      2 |
| 08.08.2019 10:54 |  2334 | 65464 |   565 |    76 |   1000 |     19 |
| 08.08.2019 11:15 |   232 | 343   |   343 |    43 |      0 |      2 |
| 08.08.2019 11:30 |  2323 | rtttt |  3434 |    34 |      0 |      2 |
| 10.08.2019 14:47 |     - | -     |     - |     - |    123 |     23 |
+------------------+-------+-------+-------+-------+--------+--------+

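As you can see, for each combination of COL_A, COL_B, COL_C and COL_D I am trying to get the single row with the latest timestamp (which is also part of the primary key).

So far I have tried a query like this: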

SELECT Timestamp, COL_A, COL_B, COL_C, COL_D, Data_A, Data_B
FROM XY AS op
WHERE op.Timestamp = (
    SELECT MAX(tsRow.Timestamp)
    FROM XY AS tsRow
    WHERE op.COL_A = tsRow.COL_A
      AND op.COL_B = tsRow.COL_B
      AND op.COL_C = tsRow.COL_C
      AND op.COL_D = tsRow.COL_D
);

At first glance, the results look fine.

Is there a better or safer way to get the result I want?

1 Answer:

Answer 0: (score: 0)

demo:db<>fiddle

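You can use the DISTINCT ON clause, which gives you the first record of each ordered group. Your groups are defined by (A, B, C, D), and within each group the Timestamp column is sorted in descending order so that the latest record comes first.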

SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
    *
FROM
    mytable
ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC

If you want the rows in your expected order, you need to add another ORDER BY after this operation:

SELECT
    *
FROM (
    SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
        *
    FROM
        mytable
    ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC
) s
ORDER BY "Timestamp"
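
As a point of comparison, the same "latest row per group" result can also be sketched with a window function, which is more portable than the PostgreSQL-specific DISTINCT ON. This is only a sketch: it assumes the table is called mytable as in the queries above, and that the data columns are named "Data_A" and "Data_B" with the same quoting as the key columns.

```sql
-- Sketch: number the rows of each (A, B, C, D) group from newest to oldest,
-- then keep only the newest one (rn = 1).
-- Assumption: "Data_A" and "Data_B" are quoted identifiers like the key columns.
SELECT "Timestamp", "COL_A", "COL_B", "COL_C", "COL_D", "Data_A", "Data_B"
FROM (
    SELECT
        t.*,
        ROW_NUMBER() OVER (
            PARTITION BY "COL_A", "COL_B", "COL_C", "COL_D"
            ORDER BY "Timestamp" DESC
        ) AS rn
    FROM mytable AS t
) AS ranked
WHERE rn = 1
ORDER BY "Timestamp";
```

DISTINCT ON is usually shorter to write in PostgreSQL. The window-function form may be useful if the query ever has to run on another database.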

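Note: if the Timestamp column is part of the PK, are you sure you really need the other four columns in the PK as well? It looks as though the Timestamp column is already unique on its own.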
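
One way to check that observation against the real data is to look for timestamps that occur more than once. This is a minimal sketch, again assuming the table name mytable and the quoted "Timestamp" column used above.

```sql
-- Sketch: list every timestamp that appears on more than one row.
-- An empty result means "Timestamp" is unique in the current data.
SELECT "Timestamp", COUNT(*) AS row_count
FROM mytable
GROUP BY "Timestamp"
HAVING COUNT(*) > 1;
```

An empty result only describes the rows currently in the table. Whether future rows can share a timestamp depends on how the data is produced.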