PostgreSQL: select the daily maximum and the hour at which it occurred

Time: 2013-10-01 15:14:58

Tags: sql postgresql timestamp greatest-n-per-group

I have the following table structure, holding hourly data for each day:

time_of_ocurrence(timestamp); particles(numeric)

"2012-11-01 00:30:00";191.3
"2012-11-01 01:30:00";46
 ...
"2013-01-01 02:30:00";319.6

How can I select the daily maximum and the time at which it occurred? I tried:

SELECT date_trunc('hour', time_of_ocurrence) as hora,
MAX(particles)
from my_table WHERE time_of_ocurrence > '2013-09-01'
GROUP BY hora ORDER BY hora

But it doesn't work; it returns one row per hour instead of one row per day:

"2013-09-01 00:00:00";34.35
"2013-09-01 01:00:00";33.13
"2013-09-01 02:00:00";33.09
"2013-09-01 03:00:00";28.08

My result should be in this format (one maximum per day, showing the hour):

"2013-09-01 05:00:00";100.35
"2013-09-02 03:30:00";80.13

How can I do this? Thanks!

2 answers:

Answer 0: (score: 3)

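This type of question comes up often on StackOverflow. Such questions are categorized under the greatest-n-per-group tag, if you'd like to see other solutions.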

Edit: I changed the code below to group by day instead of by hour.

Here's one solution:

SELECT t.*
FROM (
  SELECT date_trunc('day', time_of_ocurrence) as hora, MAX(particles) AS particles
  FROM my_table
  GROUP BY hora
) AS _max
INNER JOIN my_table AS t 
  ON _max.hora = date_trunc('day', t.time_of_ocurrence)
  AND _max.particles = t.particles
WHERE time_of_ocurrence > '2013-09-01'
ORDER BY time_of_ocurrence;

If several rows share the day's maximum value, this query returns more than one row for that day.
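To see whether such ties actually occur in the data, a check along the following lines may help. This is only a sketch, assuming the my_table from the question; the names ranked, rnk and rows_at_max are made up for illustration. RANK() gives every row tied for the daily maximum the rank 1, so counting those rows per day reveals the days with duplicates.

```sql
-- List the days on which more than one row holds the daily maximum.
SELECT date_trunc('day', time_of_ocurrence) AS day,
       count(*) AS rows_at_max
FROM (
  SELECT time_of_ocurrence,
         -- RANK() assigns 1 to every row tied for the highest particles value
         RANK() OVER (PARTITION BY date_trunc('day', time_of_ocurrence)
                      ORDER BY particles DESC) AS rnk
  FROM my_table
) AS ranked
WHERE rnk = 1
GROUP BY 1
HAVING count(*) > 1
ORDER BY 1;
```

An empty result means the join above already returns exactly one row per day.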

Another solution, using window functions, that does not return duplicates:

SELECT * FROM (
  SELECT *, 
    ROW_NUMBER() OVER (PARTITION BY date_trunc('day', time_of_ocurrence) 
        ORDER BY particles DESC) AS _rn
  FROM my_table
) AS _max
WHERE _rn = 1
ORDER BY time_of_ocurrence;

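If several rows share the same maximum, exactly one of them is still numbered row 1, but which one is not guaranteed. If you need control over which row is numbered 1, add a unique column to the ORDER BY inside the OVER clause to break such ties.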
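A minimal sketch of that tie-break, assuming time_of_ocurrence is unique in the table (the question does not say so). The temporary table and its rows are invented sample data, included only so the example runs on its own; 2013-09-01 has two readings tied at 100.35.

```sql
-- Invented sample data; the real my_table comes from the question.
CREATE TEMP TABLE my_table (
  time_of_ocurrence timestamp,
  particles         numeric
);

INSERT INTO my_table VALUES
  ('2013-09-01 03:30:00',  80.13),
  ('2013-09-01 05:30:00', 100.35),
  ('2013-09-01 09:30:00', 100.35),  -- ties with the 05:30 reading
  ('2013-09-02 03:30:00',  80.13),
  ('2013-09-02 04:30:00',  12.5);

SELECT time_of_ocurrence, particles
FROM (
  SELECT time_of_ocurrence, particles,
    ROW_NUMBER() OVER (
      PARTITION BY date_trunc('day', time_of_ocurrence)
      ORDER BY particles DESC,      -- highest reading first
               time_of_ocurrence    -- on a tie, the earliest reading wins
    ) AS _rn
  FROM my_table
) AS _max
WHERE _rn = 1
ORDER BY time_of_ocurrence;

-- Result with the sample rows above:
-- 2013-09-01 05:30:00 | 100.35
-- 2013-09-02 03:30:00 |  80.13
```

Using time_of_ocurrence DESC as the second sort key would pick the latest of the tied readings instead.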

Answer 1: (score: 2)

Use window functions:

select distinct
  date_trunc('day',time_of_ocurrence) as day,
  max(particles) over (partition by date_trunc('day',time_of_ocurrence)) as particles_max_of_day,
  first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc)
from my_table
order by 1

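One edge case is when the same maximum number of particles occurs more than once on the same day, at different times. This version picks one of those times arbitrarily. If you prefer one over the other (for example, always the earlier one), you can add it to the ORDER BY clause: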

      first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc, time_of_ocurrence)
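Put together, the complete query with that tie-break might look like the sketch below. It assumes the my_table from the question; the window names w_day and w_peak and the alias hour_of_max are introduced here for readability and are not part of the original answer. The WINDOW clause only avoids repeating the partition definition.

```sql
SELECT DISTINCT
  date_trunc('day', time_of_ocurrence) AS day,
  max(particles) OVER w_day AS particles_max_of_day,
  -- hour of the highest reading; ties resolved in favour of the earliest time
  first_value(date_trunc('hour', time_of_ocurrence)) OVER w_peak AS hour_of_max
FROM my_table
WINDOW
  w_day  AS (PARTITION BY date_trunc('day', time_of_ocurrence)),
  w_peak AS (PARTITION BY date_trunc('day', time_of_ocurrence)
             ORDER BY particles DESC, time_of_ocurrence)
ORDER BY 1;
```

Every row of a given day yields the same three values, so DISTINCT collapses each day to a single row.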