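I'm trying to generate a rolling average over a set period of time, grouped by an item ID column.
Here is the basic layout of the table and some dummy data, with the fluff stripped out: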
---------------------------------------------
| id | itemid | isup  | logged              |
---------------------------------------------
| 1  | 1      | true  | 2017-03-23 12:55:00 |
| 2  | 1      | false | 2017-03-23 12:57:00 |
| 3  | 1      | true  | 2017-03-23 13:07:00 |
| 4  | 1      | false | 2017-03-23 13:09:00 |
| 5  | 1      | true  | 2017-03-23 13:50:00 |
| 6  | 2      | false | 2017-03-23 12:55:00 |
| 7  | 2      | true  | 2017-03-23 14:00:00 |
| 8  | 2      | false | 2017-03-23 14:03:00 |
---------------------------------------------
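I found an answer to a previous question on rolling averages, but I can't seem to figure out how to group the averages by item ID; nearly every approach I have tried ended with the statistics coming out wrong.
This is my starting point. I suspect my lack of understanding of ROW_NUMBER() OVER isn't helping.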
SELECT id, itemid, AVG(isup)
    OVER (PARTITION BY groupnr ORDER BY logged) AS averagehour
FROM (
    SELECT id, itemid, isup, logged, intervalgroup,
        itemid - ROW_NUMBER() OVER (
            partition by intervalgroup ORDER BY logged) AS groupnr
    FROM (
        SELECT id, itemid, logged,
            CASE WHEN isup = TRUE THEN 1 ELSE 0 END AS isup,
            'epoch'::TIMESTAMP + '3600 seconds'::INTERVAL *
                (EXTRACT(EPOCH FROM logged)::INT4 / 3600) AS intervalgroup
        FROM uplog
    ) alias_inner
) alias_outer
ORDER BY logged;
Any help would be much appreciated.
Answer 0 (score: 0)
My answer assumes
that the type of logged is timestamp with time zone, which is the only reasonable data type for logging data;
that your complicated date arithmetic is meant to compute the value of logged, rounded down to the next lower hour, in time zone UTC (otherwise, why would you use 'epoch'::timestamp as the base?);
that you want to group by that rounded timestamp and itemid.
Here is an answer:
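The query that belonged here is not preserved. As a stand-in, the following is a minimal sketch that fits the assumptions stated in this answer. It is not the answerer's original query: the use of date_trunc and AT TIME ZONE is an assumption of this sketch, as is the table definition uplog(id, itemid, isup boolean, logged timestamp with time zone) taken from the question.

```sql
-- Sketch only, not the answerer's original query.
-- Assumes uplog(id, itemid, isup boolean, logged timestamp with time zone).
SELECT id,
       itemid,
       logged,
       avg(isup::integer) OVER (
           -- one window per item and per UTC hour
           PARTITION BY itemid, date_trunc('hour', logged AT TIME ZONE 'UTC')
           -- with ORDER BY, the default frame runs from the start of the
           -- partition to the current row, which gives a running average
           ORDER BY logged
       ) AS averagehour
FROM uplog
ORDER BY itemid, logged;
```

Partitioning directly on itemid and the truncated hour avoids the ROW_NUMBER() subtraction altogether, at the cost of restarting the average at every hour boundary.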
Answer 1 (score: 0)
The linked answer contains almost everything you need. If you want to "group" further (e.g. by itemid), you just need to add these "groups" to the PARTITION BY clauses of the window functions:
select *, avg(isup::int) over (partition by itemid, group_nr order by logged) as rolling_avg
from (
    select *, id - row_number() over (partition by itemid, interval_group order by logged) as group_nr
    from (
        select *, 'epoch'::timestamp + '3600 seconds'::interval * (extract(epoch from logged)::int4 / 3600) as interval_group
        from dummy
    ) t1
) t2
order by itemid, logged
Note, however, that this (and the linked answer) only works because id has no gaps and follows the order of the table's timestamp field. If that is not the case, you will need
row_number() over (partition by itemid order by logged) - row_number() over (partition by itemid, interval_group order by logged) as group_nr
instead of id - row_number() ...
http://rextester.com/YBSC43615
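To illustrate the caveat about id, here is a small self-contained sketch. The three inline rows are made-up illustration data, not taken from the question: item 1 has ids 1 and 3 because a row for item 2 was logged in between. With id - row_number() those two rows would get group numbers 0 and 1 and be averaged separately; with the difference of the two row numbers both get 0.

```sql
-- Hypothetical data: ids of item 1 are not consecutive (1 and 3).
with dummy(id, itemid, isup, logged) as (
    values (1, 1, true,  timestamp '2017-03-23 12:55:00'),
           (2, 2, false, timestamp '2017-03-23 12:55:00'),
           (3, 1, false, timestamp '2017-03-23 12:57:00')
)
select *, avg(isup::int) over (partition by itemid, group_nr order by logged) as rolling_avg
from (
    select *,
           -- position within the item minus position within the item's hour:
           -- constant for consecutive rows of one item inside the same hour
           row_number() over (partition by itemid order by logged)
         - row_number() over (partition by itemid, interval_group order by logged) as group_nr
    from (
        select *, 'epoch'::timestamp + '3600 seconds'::interval * (extract(epoch from logged)::int4 / 3600) as interval_group
        from dummy
    ) t1
) t2
order by itemid, logged;
```

Under these assumptions the two rows of item 1 share one group, so their rolling averages should come out as 1 and then 0.5.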
Also, if you intend to use only hourly groups, you can truncate logged to the hour directly instead of using the more general arithmetic (as @LaurenzAlbe already noted).
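The exact expression suggested for hourly groups is not preserved here. A likely candidate is PostgreSQL's date_trunc function; the sketch below is an assumption along those lines and shows only the innermost subquery of the query above.

```sql
-- Assumed replacement for the epoch arithmetic; hourly groups only.
-- Matches the arithmetic version when logged is a timestamp without time zone
-- (for timestamp with time zone, date_trunc uses the session time zone).
select *, date_trunc('hour', logged) as interval_group
from dummy
```

The epoch arithmetic remains the more flexible choice when the bucket size is something other than a unit date_trunc supports, for example 15 minutes.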