Say I have the following table:
CREATE TABLE stock_prices (
    stock TEXT NOT NULL,
    date  DATE NOT NULL,
    price REAL NOT NULL,
    UNIQUE (stock, date)
);
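For each day, I want to compute the highest price of each stock over the preceding 3-month window.
I can't do a simple self-join on date - INTERVAL '3 MONTH' because my stock_prices table has "holes" on holidays and weekends. Likewise, a naive window doesn't work either: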
SELECT
    stock,
    date,
    LAST_VALUE(price) OVER (PARTITION BY stock ORDER BY date ROWS 90 PRECEDING)
FROM stock_prices
What I really want is a window frame whose bounds depend on a condition based on the current row. Is that possible in PostgreSQL?
Answer 0 (score: 5)
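You can use the function generate_series() to fill in the missing rows, so that the window function returns correct data. The start and end dates you pass to generate_series() define the reporting period: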
select
    stock,
    date,
    price,
    max(price) over (partition by stock order by date rows 90 preceding)
from (
    select d::date as date, s.stock, sp.price
    from generate_series('2016-01-01'::date, '2016-07-28', '1d') g(d)
    cross join (
        select distinct stock
        from stock_prices
    ) s
    left join stock_prices sp on g.d = sp.date and s.stock = sp.stock
) s
order by 1, 2;
An alternative solution uses a simple correlated subquery:
select
    stock,
    date,
    price,
    (
        select max(price)
        from stock_prices sp2
        where sp2.stock = sp1.stock
        and sp2.date >= sp1.date - interval '90 days'
        and sp2.date <= sp1.date
    ) highest_price
from
    stock_prices sp1
order by 1, 2;
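This subquery version is much more expensive. If you use it, make sure there is an index on (stock, date). In PostgreSQL the UNIQUE (stock, date) constraint in the question's table already creates one; otherwise create it explicitly: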
create index on stock_prices (stock, date);
Answer 1 (score: 0)
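The generate_series option should work fine, but because months are not always 30 days long, a fixed 90-row window will not always line up with calendar months.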
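To make the mismatch concrete, here is a small sketch that compares a fixed 90-day offset with a calendar 3-month offset. The date 2016-07-28 is borrowed from the other answer's reporting period purely for illustration, and the column aliases are hypothetical:

```sql
-- Fixed 90-day offset versus calendar 3-month offset (PostgreSQL).
select
    date '2016-07-28' - 90                  as ninety_days_back,   -- date: 2016-04-29
    date '2016-07-28' - interval '3 months' as three_months_back;  -- timestamp: 2016-04-28 00:00:00
```

The two results differ by a day here, and the gap changes depending on which months the window spans.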
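If you want to use an interval, you can also do a self-join and aggregate. This joins each row to every row that meets the criteria (in this case I've set the interval to 1 week) and takes the maximum over that result set: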
select a.stock,
       a.date,
       a.price,
       max( b.price )
from stock_prices as a
left join
     stock_prices as b
  on a.stock = b.stock
 and b.date between (a.date - interval '7 days') and a.date
group by a.stock,
         a.date,
         a.price
order by a.stock,
         a.date;
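As a possible variation on the interval idea, PostgreSQL 11 and later accept an offset in RANGE window frames, so the frame can be bounded by the value of date itself rather than by a row count. The sketch below assumes such a version and the stock_prices table from the question; the highest_price alias is only illustrative:

```sql
-- Assumes PostgreSQL 11 or later, where RANGE frames accept an offset.
-- The frame covers every row of the same stock whose date lies within
-- the 3 calendar months up to and including the current row's date,
-- so gaps for weekends and holidays do not shift the window.
select
    stock,
    date,
    price,
    max(price) over (
        partition by stock
        order by date
        range between interval '3 months' preceding and current row
    ) as highest_price
from stock_prices
order by stock, date;
```

Unlike the self-join, this reads the table once and needs no GROUP BY, though whether it is actually faster would depend on the data and the version in use.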