My PostgreSQL 9.1 database has the following table:
select * from ro;
date | shop_id | amount
-----------+----------+--------
2013-02-07 | 1001 | 3
2013-01-31 | 1001 | 2
2013-01-24 | 1001 | 1
2013-01-17 | 1001 | 5
2013-02-10 | 1001 | 10
2013-02-03 | 1001 | 4
2012-12-27 | 1001 | 6
2012-12-20 | 1001 | 8
2012-12-13 | 1001 | 4
2012-12-06 | 1001 | 3
2012-10-29 | 1001 | 3
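I am trying to compute a moving average that compares each Thursday's amount with the average of the previous 3 Thursdays, excluding the current Thursday. This is my query: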
select date, shop_id, amount, extract(dow from date),
avg(amount) OVER (PARTITION BY extract(dow from date) ORDER BY date DESC
ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING)
from ro
where extract(dow from date) = 4
This is the result it gives:
date | shop_id | amount | date_part | avg
-----------+----------+--------+-----------+--------------------
2013-02-07 | 1001 | 3 | 4 | 2.0000000000000000
2013-01-31 | 1001 | 2 | 4 | 2.6666666666666667
2013-01-24 | 1001 | 1 | 4 | 4.0000000000000000
2013-01-17 | 1001 | 5 | 4 | 6.3333333333333333
2012-12-27 | 1001 | 6 | 4 | 6.0000000000000000
2012-12-20 | 1001 | 8 | 4 | 5.0000000000000000
2012-12-13 | 1001 | 4 | 4 | 3.5000000000000000
2012-12-06 | 1001 | 3 | 4 | 3.0000000000000000
I expect:
date | shop_id | amount | date_part | avg
-----------+----------+--------+-----------+--------------------
2013-02-07 | 1001 | 3 | 4 | 2.6666666666666667
2013-01-31 | 1001 | 2 | 4 | 4.0000000000000000
2013-01-24 | 1001 | 1 | 4 | 6.3333333333333333
2013-01-17 | 1001 | 5 | 4 | 6.0000000000000000
2012-12-27 | 1001 | 6 | 4 | 5.0000000000000000
2012-12-20 | 1001 | 8 | 4 |
2012-12-13 | 1001 | 4 | 4 |
2012-12-06 | 1001 | 3 | 4 |
Answer 0 (score: 15)
select
"date",
shop_id,
amount,
extract(dow from date),
case when
row_number() over (order by date) > 3
then
avg(amount) OVER (
ORDER BY date DESC
ROWS BETWEEN 1 following AND 3 FOLLOWING
)
else null end
from (
select *
from ro
where extract(dow from date) = 4
) s
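What is wrong with the OP's query is the frame specification, which starts at the current row and therefore includes the current Thursday in the average: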
ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING
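Apart from that, my query filters for Thursdays in a subquery before applying the expensive window functions, which avoids unneeded computation.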
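A small Python sketch (not part of the original answer) can make the frame difference concrete. It assumes the eight Thursday amounts from the question, listed newest first as ORDER BY date DESC would return them. The helper name frame_avg is hypothetical; it mimics a ROWS BETWEEN ... FOLLOWING frame by slicing a list.

```python
def frame_avg(values, start, end):
    """Average rows i+start .. i+end for every row i.

    Mimics ROWS BETWEEN <start> FOLLOWING AND <end> FOLLOWING.
    Rows past the end of the list are not part of the frame, and an
    empty frame yields None (the counterpart of SQL NULL).
    """
    result = []
    for i in range(len(values)):
        frame = values[i + start : i + end + 1]
        result.append(sum(frame) / len(frame) if frame else None)
    return result


# Thursday amounts from the question, newest first (ORDER BY date DESC).
amounts = [3, 2, 1, 5, 6, 8, 4, 3]

# 0 PRECEDING AND 2 FOLLOWING: the frame starts at the current row.
op_frame = frame_avg(amounts, 0, 2)

# 1 FOLLOWING AND 3 FOLLOWING: the frame starts one row after the current row.
fixed_frame = frame_avg(amounts, 1, 3)

print(op_frame)
print(fixed_frame)
```

The first list reproduces the averages in the OP's result, because the current row is part of each frame. The second list matches the expected output for the five newest rows. The three oldest rows have fewer than three earlier Thursdays, so their frames are short or empty; the CASE guard on row_number() is what turns those rows into NULL.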
If you need to partition by shop_id, add partition by shop_id to both window functions, avg and row_number.
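As a hedged illustration of the partitioned variant, the sketch below runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or newer, for window functions) is only a stand-in for PostgreSQL here, so strftime('%w', date) replaces extract(dow from date). The rows for shop 1002 and the column alias avg_prev3 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ro (date TEXT, shop_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO ro VALUES (?, ?, ?)",
    [
        # Shop 1001: four Thursdays taken from the question.
        ("2013-02-07", 1001, 3), ("2013-01-31", 1001, 2),
        ("2013-01-24", 1001, 1), ("2013-01-17", 1001, 5),
        # Shop 1002: hypothetical rows, invented for this illustration.
        ("2013-02-07", 1002, 10), ("2013-01-31", 1002, 20),
        ("2013-01-24", 1002, 30), ("2013-01-17", 1002, 40),
    ],
)

query = """
SELECT date, shop_id, amount,
       CASE WHEN row_number() OVER (PARTITION BY shop_id ORDER BY date) > 3
            THEN avg(amount) OVER (PARTITION BY shop_id ORDER BY date DESC
                                   ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING)
       END AS avg_prev3
FROM (
    SELECT * FROM ro
    -- strftime('%w', ...) is SQLite's stand-in for extract(dow from date)
    WHERE CAST(strftime('%w', date) AS INTEGER) = 4
) s
ORDER BY shop_id, date DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the partition in both window definitions, each shop's newest Thursday is averaged only over that shop's three earlier Thursdays, and the row_number() guard restarts for every shop.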
Answer 1 (score: 6)
I think a better answer might be:
SELECT date, shop_id, amount,
extract(dow from date) AS dow,
CASE WHEN count(amount) OVER w = 3
THEN avg(amount) OVER w END AS average_amt
FROM ro
WHERE extract(dow from date) = 4
WINDOW w AS (ORDER BY date DESC ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING)
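I think it is cleaner to use the same window both to check the number of rows in the frame and to take the average. (It also avoids having two separate window aggregates, as in the original answer.)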
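As a rough sanity check, the same query can be run against the question's data in an in-memory SQLite database through Python's sqlite3 module. This is a sketch under assumptions, not a PostgreSQL run: SQLite (3.25 or newer) stands in for PostgreSQL 9.1, strftime('%w', date) replaces extract(dow from date), and an ORDER BY is added so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ro (date TEXT, shop_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO ro VALUES (?, ?, ?)",
    [
        # All eleven rows of the question's table, including non-Thursdays.
        ("2013-02-07", 1001, 3), ("2013-01-31", 1001, 2),
        ("2013-01-24", 1001, 1), ("2013-01-17", 1001, 5),
        ("2013-02-10", 1001, 10), ("2013-02-03", 1001, 4),
        ("2012-12-27", 1001, 6), ("2012-12-20", 1001, 8),
        ("2012-12-13", 1001, 4), ("2012-12-06", 1001, 3),
        ("2012-10-29", 1001, 3),
    ],
)

query = """
SELECT date, amount,
       CASE WHEN count(amount) OVER w = 3
            THEN avg(amount) OVER w END AS average_amt
FROM ro
-- strftime('%w', ...) is SQLite's stand-in for extract(dow from date)
WHERE CAST(strftime('%w', date) AS INTEGER) = 4
WINDOW w AS (ORDER BY date DESC ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING)
ORDER BY date DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that count(amount) counts only non-NULL amounts, so the guard also yields NULL when a frame holds three rows but one of them has a NULL amount.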
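As for the earlier answer's claim that "my query avoids unneeded computation by filtering Thursdays before applying the expensive window functions", the same is true of the OP's query and of mine, as running EXPLAIN on either one shows.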