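I've realized that a database query is returning unexpected results because of my improper use of "DISTINCT ON" and "GROUP BY".
I'm hoping someone can set me straight. The actual query is quite complex, so I'll simplify it:
I have a table/inner query that consists of an object_id and a timestamp: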
CREATE TABLE test_select ( object_id INT , event_timestamp timestamp );
COPY test_select (object_id , event_timestamp) FROM stdin (DELIMITER '|');
1 | 2013-01-27 21:01:20
1 | 2012-06-28 14:36:26
1 | 2013-02-21 04:16:48
2 | 2012-06-27 19:53:05
2 | 2013-02-03 17:35:58
3 | 2012-06-14 20:17:00
3 | 2013-02-15 19:03:34
4 | 2012-06-13 13:59:47
4 | 2013-02-23 06:31:16
5 | 2012-07-03 01:45:56
5 | 2012-06-11 21:33:26
\.
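I'm trying to select each distinct object_id once, deduplicated and ordered by its timestamp in reverse chronological order,
so the result should be [4, 1, 3, 2, 5].
I think this does what I need (it seems to):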
SELECT object_id
FROM test_select
GROUP BY object_id
ORDER BY max(event_timestamp) DESC
;
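For testing/auditing purposes, I sometimes want to include the timestamp field. I can't seem to figure out how to include another field in that query.
Can anyone point out glaring problems in my SQL above, or suggest how to include the auditing information?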
Answer 0: (score: 18)
To be able to select all columns, and not only object_id and MAX(event_timestamp), you can use DISTINCT ON:
SELECT DISTINCT ON (object_id)
object_id, event_timestamp ---, more columns
FROM test_select
ORDER BY object_id, event_timestamp DESC ;
If you want the results ordered by event_timestamp DESC rather than by object_id, you need to wrap the query in a derived table or a CTE:
SELECT *
FROM
( SELECT DISTINCT ON (object_id)
object_id, event_timestamp ---, more columns
FROM test_select
ORDER BY object_id, event_timestamp DESC
) AS t
ORDER BY event_timestamp DESC ;
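To make the semantics concrete, here is a minimal Python sketch of what the query above computes: order the rows by object_id and then by event_timestamp descending, keep the first row of each object_id group, and finally re-sort the survivors by timestamp. The helper name latest_per_object and the in-memory ROWS list are illustrative assumptions; this only mimics the behaviour and is not how PostgreSQL executes DISTINCT ON.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (object_id, event_timestamp).
# ISO-formatted timestamp strings sort chronologically as plain strings.
ROWS = [
    (1, "2013-01-27 21:01:20"),
    (1, "2012-06-28 14:36:26"),
    (1, "2013-02-21 04:16:48"),
    (2, "2012-06-27 19:53:05"),
    (2, "2013-02-03 17:35:58"),
    (3, "2012-06-14 20:17:00"),
    (3, "2013-02-15 19:03:34"),
    (4, "2012-06-13 13:59:47"),
    (4, "2013-02-23 06:31:16"),
    (5, "2012-07-03 01:45:56"),
    (5, "2012-06-11 21:33:26"),
]


def latest_per_object(rows):
    """Hypothetical helper: newest row per object_id, newest objects first."""
    # Two stable sorts give ORDER BY object_id, event_timestamp DESC.
    ordered = sorted(rows, key=itemgetter(1), reverse=True)
    ordered = sorted(ordered, key=itemgetter(0))
    # DISTINCT ON (object_id): keep only the first row of each group.
    firsts = [next(group) for _, group in groupby(ordered, key=itemgetter(0))]
    # Outer ORDER BY event_timestamp DESC.
    return sorted(firsts, key=itemgetter(1), reverse=True)


result = latest_per_object(ROWS)
print([object_id for object_id, _ in result])  # prints [4, 1, 3, 2, 5]
```

Because whole rows are carried through, any extra audit columns would simply ride along with the winning row of each group.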
Alternatively, you can use window functions, such as ROW_NUMBER():
WITH cte AS
( SELECT ROW_NUMBER() OVER (PARTITION BY object_id
ORDER BY event_timestamp DESC)
AS rn,
object_id, event_timestamp ---, more columns
FROM test_select
)
SELECT object_id, event_timestamp ---, more columns
FROM cte
WHERE rn = 1
ORDER BY event_timestamp DESC ;
Or use the MAX() aggregate with OVER:
WITH cte AS
( SELECT MAX(event_timestamp) OVER (PARTITION BY object_id)
AS max_event_timestamp,
object_id, event_timestamp ---, more columns
FROM test_select
)
SELECT object_id, event_timestamp ---, more columns
FROM cte
WHERE event_timestamp = max_event_timestamp
ORDER BY event_timestamp DESC ;
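One difference between the two window-function variants is worth checking before picking one: when an object has several rows sharing its newest timestamp, the ROW_NUMBER() version keeps exactly one of them, while the MAX() OVER version keeps all of them. The sketch below illustrates this with Python's built-in sqlite3 module rather than PostgreSQL (an assumption made so the example is self-contained; it needs SQLite 3.25 or later for window functions). The tied rows and the helper name run are made up for illustration.

```python
import sqlite3

# Hypothetical data: object 1 has two rows sharing its newest timestamp.
ROWS = [
    (1, "2013-02-21 04:16:48"),
    (1, "2013-02-21 04:16:48"),
    (1, "2012-06-28 14:36:26"),
    (2, "2013-02-03 17:35:58"),
]

ROW_NUMBER_QUERY = """
    WITH cte AS (
        SELECT ROW_NUMBER() OVER (PARTITION BY object_id
                                  ORDER BY event_timestamp DESC) AS rn,
               object_id, event_timestamp
        FROM test_select
    )
    SELECT object_id, event_timestamp
    FROM cte
    WHERE rn = 1
    ORDER BY event_timestamp DESC
"""

MAX_OVER_QUERY = """
    WITH cte AS (
        SELECT MAX(event_timestamp) OVER (PARTITION BY object_id)
                   AS max_event_timestamp,
               object_id, event_timestamp
        FROM test_select
    )
    SELECT object_id, event_timestamp
    FROM cte
    WHERE event_timestamp = max_event_timestamp
    ORDER BY event_timestamp DESC
"""


def run(query, rows=ROWS):
    """Load rows into a throwaway in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Timestamps are stored as ISO text, which compares chronologically.
        conn.execute(
            "CREATE TABLE test_select (object_id INT, event_timestamp TEXT)"
        )
        conn.executemany("INSERT INTO test_select VALUES (?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


print(len(run(ROW_NUMBER_QUERY)))  # one row per object
print(len(run(MAX_OVER_QUERY)))    # tied rows for object 1 both survive
```

If ties are possible in the real data and only one row per object is wanted, the ROW_NUMBER() form (or DISTINCT ON) is the safer choice, ideally with an extra tie-breaking column in the ORDER BY.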
Answer 1: (score: 3)
This is probably not the best way to handle it, but you can try using a window function:
SELECT DISTINCT object_id, MAX(event_timestamp)
OVER (PARTITION BY object_id) AS max_event_timestamp
FROM test_select ORDER BY max_event_timestamp DESC;
On the other hand, a plain GROUP BY works as well:
SELECT object_id, MAX(event_timestamp) as max_event_timestamp
FROM test_select
GROUP BY object_id
ORDER BY max_event_timestamp DESC;
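As a quick sanity check, the GROUP BY form can be run against the question's sample data. The sketch below uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption, chosen only so it runs without a server); the helper name newest_by_group is hypothetical.

```python
import sqlite3

# Sample rows from the question: (object_id, event_timestamp).
ROWS = [
    (1, "2013-01-27 21:01:20"),
    (1, "2012-06-28 14:36:26"),
    (1, "2013-02-21 04:16:48"),
    (2, "2012-06-27 19:53:05"),
    (2, "2013-02-03 17:35:58"),
    (3, "2012-06-14 20:17:00"),
    (3, "2013-02-15 19:03:34"),
    (4, "2012-06-13 13:59:47"),
    (4, "2013-02-23 06:31:16"),
    (5, "2012-07-03 01:45:56"),
    (5, "2012-06-11 21:33:26"),
]

GROUP_BY_QUERY = """
    SELECT object_id, MAX(event_timestamp) AS max_event_timestamp
    FROM test_select
    GROUP BY object_id
    ORDER BY max_event_timestamp DESC
"""


def newest_by_group(rows):
    """Run the GROUP BY query on an in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Timestamps are stored as ISO text, which compares chronologically.
        conn.execute(
            "CREATE TABLE test_select (object_id INT, event_timestamp TEXT)"
        )
        conn.executemany("INSERT INTO test_select VALUES (?, ?)", rows)
        return conn.execute(GROUP_BY_QUERY).fetchall()
    finally:
        conn.close()


for object_id, newest in newest_by_group(ROWS):
    print(object_id, newest)
```

This returns the ids in the order the question expects, with the newest timestamp alongside each one. It only covers columns that can be aggregated, though; pulling in other columns from the same row still calls for DISTINCT ON or a window function.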