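I'm trying to learn SQL, so please bear with me. I'm using PostgreSQL 9.3.
I want to average a column over a date window. I've been able to write a window function that does this with a fixed interval, but I'd like to be able to do it with a growing interval. By that I mean: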
average values from date_0 to date_1
average values from date_0 to date_2
average values from date_0 to date_3
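..... so date_0 stays the same while date_x grows, creating a larger sample each time.
I assume there is a better way than running a separate query for each range I want to average. Any advice is appreciated. Thanks.
I'm trying to create evenly spaced bins that can be used to aggregate the table's values. I arrive at my interval with: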
(MAX(date) - MIN(date)) / bins
where date is a column of the table and bins is the number of bins I'd like to divide the table into.
date_0 = MIN(date)
date_n = MIN(date) + (interval * n)
Answer 0 (score: 4)
I suggest the handy function width_bucket().
To get the average for each time segment ("bin"):
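-- Note: the upper bound of width_bucket() is exclusive, so rows at
-- max(the_date) land in an extra bucket numbered bins + 1.
SELECT width_bucket(extract(epoch FROM t.the_date)
                  , x.min_epoch, x.max_epoch, x.bins) AS bin
     , avg(value) AS bin_avg
FROM   tbl t
     , (SELECT extract(epoch FROM min(the_date)) AS min_epoch
             , extract(epoch FROM max(the_date)) AS max_epoch
             , 10 AS bins  -- number of bins
        FROM   tbl t
       ) x
GROUP  BY 1;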
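To try the binning query without an existing table, here is a minimal sketch on made-up data. The CTE stands in for tbl with the same column names; the sample rows and the bin count of 2 are assumptions for illustration. It also wraps width_bucket() in least() to fold the overflow bucket (bins + 1) into the last bin, which is one possible choice and not part of the query above.

```sql
-- Hypothetical sample data; tbl, the_date and value mirror the names used above.
WITH tbl(the_date, value) AS (
   VALUES (timestamp '2014-01-01', 10)
        , (timestamp '2014-01-02', 20)
        , (timestamp '2014-01-03', 30)
        , (timestamp '2014-01-04', 40)
        , (timestamp '2014-01-05', 50)
   )
SELECT least(width_bucket(extract(epoch FROM t.the_date)
                        , x.min_epoch, x.max_epoch, x.bins)
           , x.bins) AS bin          -- fold bucket bins + 1 into the last bin
     , avg(t.value)  AS bin_avg
FROM   tbl t
CROSS  JOIN (
   SELECT extract(epoch FROM min(the_date)) AS min_epoch
        , extract(epoch FROM max(the_date)) AS max_epoch
        , 2 AS bins                  -- assumed bin count for this demo
   FROM   tbl
   ) x
GROUP  BY 1
ORDER  BY 1;
```

With these five rows, the first two dates fall into bin 1 (average 15) and the remaining three into bin 2 (average 40). Without least(), the row for 2014-01-05 would form its own bin 3.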
To get the "running average" over a (step-by-step) growing time interval:
-- Running average = cumulative sum / cumulative row count up to each bin.
-- round(x, 2) requires numeric; cast first if value is double precision.
SELECT bin, round(sum(bin_sum) OVER w / sum(bin_ct) OVER w, 2) AS running_avg
FROM  (
   SELECT width_bucket(extract(epoch FROM t.the_date)
                     , x.min_epoch, x.max_epoch, x.bins) AS bin
        , sum(value) AS bin_sum
        , count(*)   AS bin_ct
   FROM   tbl t
        , (SELECT extract(epoch FROM min(the_date)) AS min_epoch
                , extract(epoch FROM max(the_date)) AS max_epoch
                , 10 AS bins  -- number of bins
           FROM   tbl t
          ) x
   GROUP  BY 1
   ) sub
WINDOW w AS (ORDER BY bin)
ORDER  BY 1;
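If evenly spaced bins are not required and every row's date may serve as the end of the growing interval, a plain window aggregate might be enough. This is an alternative sketch, not the query above: the CTE and its sample rows are made up, and it relies on the default window frame (from the first row up to the current row and its peers) that applies when ORDER BY is given without an explicit frame clause.

```sql
-- Hypothetical sample data standing in for tbl.
WITH tbl(the_date, value) AS (
   VALUES (date '2014-01-01', 10)
        , (date '2014-01-02', 20)
        , (date '2014-01-03', 60)
   )
SELECT the_date
       -- averages all rows from the earliest date up to the current one
     , round(avg(value) OVER (ORDER BY the_date), 2) AS running_avg
FROM   tbl
ORDER  BY the_date;
```

For the three sample rows this yields 10.00, 15.00 and 30.00: the start of the interval stays fixed while its end advances one row at a time.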
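I use the_date instead of date as the column name, to avoid using reserved words as identifiers.
Since width_bucket() is currently only implemented for double precision and numeric, I extract epoch values from the_date. Details: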
Aggregating (x,y) coordinate point clouds in PostgreSQL
Answer 1 (score: 3)
If you have a set of data, you can readily get what you want in separate columns:
select avg(case when date between date_0 and date_1 then value end) as avg1,
avg(case when date between date_0 and date_2 then value end) as avg2,
. . .
avg(case when date between date_0 and date_n then value end) as avgn
from table t
where date >= date_0;
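A runnable instance of the same pattern, as a sketch only: the CTE, its rows and the concrete boundary dates are assumptions for illustration, and the column is named the_date here instead of date. Each CASE expression returns NULL outside its range, and avg() ignores NULLs, so every output column averages only the rows inside its own window.

```sql
-- Hypothetical sample data; the date literals play the role of date_0, date_1, date_2.
WITH t(the_date, value) AS (
   VALUES (date '2014-01-01', 10)
        , (date '2014-01-02', 20)
        , (date '2014-01-03', 60)
   )
SELECT avg(CASE WHEN the_date BETWEEN date '2014-01-01' AND date '2014-01-02'
                THEN value END) AS avg1   -- date_0 .. date_1
     , avg(CASE WHEN the_date BETWEEN date '2014-01-01' AND date '2014-01-03'
                THEN value END) AS avg2   -- date_0 .. date_2
FROM   t
WHERE  the_date >= date '2014-01-01';
```

On this data avg1 is 15 and avg2 is 30. The approach needs one output column per end date, so it suits a small, fixed number of ranges.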