Question

I have the following table structure, with one reading per hour for every day:

time_of_ocurrence(timestamp); particles(numeric)

"2012-11-01 00:30:00";191.3
"2012-11-01 01:30:00";46
 ...
"2013-01-01 02:30:00";319.6
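For anyone who wants to reproduce this, here is a minimal sketch of a matching table. The table name my_table comes from the queries in this post, and the column types are the ones listed above; everything else (no constraints, no indexes) is an assumption, and only the three sample rows shown above are loaded.

```sql
-- Hypothetical definition; only the column names and types come from the question.
CREATE TABLE my_table (
    time_of_ocurrence timestamp,
    particles         numeric
);

-- The three sample rows quoted in the question (the rows elided by "..." are not filled in).
INSERT INTO my_table (time_of_ocurrence, particles) VALUES
    ('2012-11-01 00:30:00', 191.3),
    ('2012-11-01 01:30:00', 46),
    ('2013-01-01 02:30:00', 319.6);
```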

How can I select the daily maximum and the time at which that maximum occurred? I tried:

SELECT date_trunc('hour', time_of_ocurrence) as hora,
MAX(particles)
from my_table WHERE time_of_ocurrence > '2013-09-01'
GROUP BY hora ORDER BY hora

But it does not work, because it returns a maximum for every hour instead of one per day:

"2013-09-01 00:00:00";34.35
"2013-09-01 01:00:00";33.13
"2013-09-01 02:00:00";33.09
"2013-09-01 03:00:00";28.08

My result should be in this format (one maximum per day, showing the hour at which it occurred):

"2013-09-01 05:00:00";100.35
"2013-09-02 03:30:00";80.13

How can I do this? Thanks!

Answer 1

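This type of question comes up frequently on StackOverflow. If you want to see other solutions, such questions are tagged greatest-n-per-group.

Edit: I changed the code below to group by day instead of by hour.

Here is one solution: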

SELECT t.*
FROM (
  SELECT date_trunc('day', time_of_ocurrence) as hora, MAX(particles) AS particles
  FROM my_table
  GROUP BY hora
) AS _max
INNER JOIN my_table AS t 
  ON _max.hora = date_trunc('day', t.time_of_ocurrence)
  AND _max.particles = t.particles
WHERE time_of_ocurrence > '2013-09-01'
ORDER BY time_of_ocurrence;

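If several rows share the daily maximum, this query returns more than one row for that day.

Another solution, using a window function, returns exactly one row per day: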

SELECT * FROM (
  SELECT *, 
    ROW_NUMBER() OVER (PARTITION BY date_trunc('day', time_of_ocurrence) 
        ORDER BY particles DESC) AS _rn
  FROM my_table
) AS _max
WHERE _rn = 1
ORDER BY time_of_ocurrence;

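If multiple rows have the same maximum value, one of them still gets row number 1, but which one is not specified. If you need control over which row is numbered 1, extend the ORDER BY inside the OVER clause (not the PARTITION BY) with a unique column to break such ties.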
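As a sketch of that tie-break, the variant below adds time_of_ocurrence as a second sort key, so the earliest reading wins when several rows share the daily maximum. It assumes time_of_ocurrence is unique per row, which the question's hourly data suggests but does not guarantee. The NULLS LAST is an extra precaution that is not in the query above: PostgreSQL sorts NULLs first under DESC by default, so a NULL in particles would otherwise be numbered 1.

```sql
SELECT time_of_ocurrence, particles
FROM (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY date_trunc('day', time_of_ocurrence)
      -- Highest reading first; ties go to the earliest timestamp.
      -- NULLS LAST keeps rows without a reading from being ranked first.
      ORDER BY particles DESC NULLS LAST, time_of_ocurrence
    ) AS _rn
  FROM my_table
) AS _max
WHERE _rn = 1
ORDER BY time_of_ocurrence;
```

Listing the two columns explicitly, rather than SELECT *, also keeps the helper column _rn out of the result.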

Answer 2

Use window functions:

select distinct
  date_trunc('day',time_of_ocurrence) as day,
  max(particles) over (partition by date_trunc('day',time_of_ocurrence)) as particles_max_of_day,
  first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc)
from my_table
order by 1

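One edge case here is when the same maximum particle count occurs at different times on the same day. This version picks one of them arbitrarily. If you prefer one over the other (for example, always the earlier one), you can add that column to the order by clause: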

      first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc, time_of_ocurrence)
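Put together, the full query with that tie-break might look like the sketch below. The alias hour_of_max is a hypothetical name added for readability, and the query assumes particles is never NULL (under DESC, PostgreSQL would sort a NULL first and first_value would pick that row).

```sql
SELECT DISTINCT
  date_trunc('day', time_of_ocurrence) AS day,
  -- Daily maximum, repeated on every row of the day's partition.
  max(particles) OVER (PARTITION BY date_trunc('day', time_of_ocurrence)) AS particles_max_of_day,
  -- Hour of the highest reading; ties go to the earliest timestamp.
  first_value(date_trunc('hour', time_of_ocurrence)) OVER (
    PARTITION BY date_trunc('day', time_of_ocurrence)
    ORDER BY particles DESC, time_of_ocurrence
  ) AS hour_of_max  -- hypothetical alias
FROM my_table
ORDER BY 1;
```

Because all three expressions are constant within a day, DISTINCT collapses each day to a single row. Note that date_trunc('hour', ...) turns a reading taken at 03:30 into 03:00; to get the exact timestamp, as in the expected output in the question, pass time_of_ocurrence to first_value directly.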

PostgreSQL: select the daily maximum and the hour at which it occurred
