Table t1 has 200,000 records. The first few are:
-----------------------------
| date | id | value |
-----------------------------
| 2/28/2019 | abc1 | 55 |
| 2/28/2019 | abc2 | 44 |
| 2/28/2019 | abc3 | 33 |
| 2/26/2019 | abc1 | 22 |
| 2/26/2019 | abc2 | 12 |
| 2/25/2019 | abc1 | 11 |
| 2/25/2019 | abc3 | 10 |
| 2/24/2019 | abc1 | 10 |
| 2/24/2019 | abc2 | 10 |
-----------------------------
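I want to take abc1 from t1, find the abc1 value on the previous date (it may be 1, 2, or 3 days earlier, but it is certainly within the last 5 days), and show the difference (the value on the first date minus the value on the previous date).
I wrote a query that does this (but it is slow):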
select
    a.date, a.id, a.value, b.value as prev_value, a.value - b.value as diff
from
    t1 a
inner join
    t1 b
on
    a.id = b.id
where
    b.date = (
        select
            max(c.date) from t1 c
        where
            c.id = a.id  -- only consider earlier rows for the same id
            and c.date < a.date
            and c.date > dateadd(day, -5, a.date)
    )
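This works, but it is very slow on 200,000 records (it takes several minutes).
How can I speed it up? (Perhaps with RANK or some other, more efficient approach.)
Expected result: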
2/28/2019 | abc1 | 33 (which is "55 - 22")
2/28/2019 | abc2 | 32 (which is "44 - 12")
2/28/2019 | abc3 | 23 (which is "33 - 10")
Thanks!
Answer 0 (score: 2)
Use lag():
select t1.*,
value - lag(value) over (partition by id order by date)
from t1;
Whichever database you are using, it should be able to take advantage of an index on (id, date, value).
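A minimal sketch of such an index, assuming the table and column names from the question. The index name ix_t1_id_date_value is made up for illustration, and the exact create index options vary by database:

```sql
-- Hypothetical index name. The column order follows the window definition:
-- id (partition by), then date (order by), then value so that the query
-- can be answered from the index alone.
create index ix_t1_id_date_value on t1 (id, date, value);
```

Putting id first and date second lets the database read each id's rows already sorted by date, which is the order lag() needs. Whether the optimizer actually uses the index depends on the database and its statistics.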
If you want to restrict this to the previous five days, use case logic:
select t1.*,
(case when date < dateadd(day, 5, lag(date) over (partition by id order by date))
then value - lag(value) over (partition by id order by date)
end)
from t1;
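The expected result in the question lists only the rows for the latest date. Here is a sketch of one way to get just those rows, assuming that is what is wanted, that date is stored as a date type, and that the database supports common table expressions. The names diffs, prev_date, prev_value, and diff are made up for illustration:

```sql
-- Compute the previous date and value per id once, then filter.
with diffs as (
    select t1.*,
           lag(date)  over (partition by id order by date) as prev_date,
           lag(value) over (partition by id order by date) as prev_value
    from t1
)
select date, id, value - prev_value as diff
from diffs
where date = (select max(date) from t1)       -- assumption: latest date only
  and date < dateadd(day, 5, prev_date);      -- previous row is within 5 days
```

On the sample rows from the question this gives 33 for abc1, 32 for abc2, and 23 for abc3 on 2/28/2019. The filter has to sit outside the CTE because window functions cannot be referenced directly in a where clause.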