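I am trying to find the lag between two series. Suppose there is a variable temp2 whose values lag behind those of temp1, where the lag is not constant.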
library(data.table)
dt <- data.table(
  datetime = seq(as.POSIXct("2000-01-01 00:00:00"), as.POSIXct("2000-01-01 09:00:00"), by = "1 hour"),
  temp1 = seq(30, 21, by = -1),
  temp2 = c(30, seq(30, 25, by = -1), seq(25, 23, by = -1))
)
I would like an additional column, lag, equal to the lag between temp1 and temp2, so that the result looks like this:
dt <- data.table(
  datetime = seq(as.POSIXct("2000-01-01 00:00:00"), as.POSIXct("2000-01-01 09:00:00"), by = "1 hour"),
  temp1 = seq(30, 21, by = -1),
  temp2 = c(30, seq(30, 25, by = -1), seq(25, 23, by = -1)),
  lag = c(0, 1, 1, 1, 1, 1, 2, 2, NA, NA)
)
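Thanks for your help :)
Answer 0 (score: 0)
1) Subtraction. If simple subtraction is sufficient (it works for this data because temp1 falls by exactly 1 per hourly step, so the difference in values equals the number of steps by which temp2 trails temp1):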
library(data.table)
dt[, lag := temp2 - temp1]
giving:
> dt
               datetime temp1 temp2 lag
 1: 2000-01-01 00:00:00    30    30   0
 2: 2000-01-01 01:00:00    29    30   1
 3: 2000-01-01 02:00:00    28    29   1
 4: 2000-01-01 03:00:00    27    28   1
 5: 2000-01-01 04:00:00    26    27   1
 6: 2000-01-01 05:00:00    25    26   1
 7: 2000-01-01 06:00:00    24    25   1
 8: 2000-01-01 07:00:00    23    25   2
 9: 2000-01-01 08:00:00    22    24   2
10: 2000-01-01 09:00:00    21    23   2
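2) dtw. Another possibility is dynamic time warping. You may need to customize it with different parameters to suit your needs, but as a starting point try: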
library(data.table)
library(dtw)
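# Align the two series; fm$index1 and fm$index2 hold the matched row indices
fm <- dt[, dtw(temp1, temp2)]
# For each row of temp1, keep the smallest index offset among its matched temp2 rows
dt[, lag := tapply(fm$index2 - fm$index1, fm$index1, min)]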
giving:
> dt
               datetime temp1 temp2 lag
 1: 2000-01-01 00:00:00    30    30   0
 2: 2000-01-01 01:00:00    29    30   1
 3: 2000-01-01 02:00:00    28    29   1
 4: 2000-01-01 03:00:00    27    28   1
 5: 2000-01-01 04:00:00    26    27   1
 6: 2000-01-01 05:00:00    25    26   1
 7: 2000-01-01 06:00:00    24    25   2
 8: 2000-01-01 07:00:00    23    25   2
 9: 2000-01-01 08:00:00    22    24   1
10: 2000-01-01 09:00:00    21    23   0
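Neither result above matches the lag column requested in the question exactly (including the trailing NA values). A direct search might get closer: for each row, look for the first row at or after it where temp2 takes the same value as temp1. The following is only a minimal sketch under stated assumptions: `first_match_lag` is a hypothetical helper name, values are compared with exact equality (real measurements would probably need a tolerance or interpolation), and the lag is counted in rows, which here correspond to hours.

```r
library(data.table)

dt <- data.table(
  datetime = seq(as.POSIXct("2000-01-01 00:00:00"), as.POSIXct("2000-01-01 09:00:00"), by = "1 hour"),
  temp1 = seq(30, 21, by = -1),
  temp2 = c(30, seq(30, 25, by = -1), seq(25, 23, by = -1))
)

# Hypothetical helper: for each x[i], number of rows until y first equals x[i],
# searching from row i onward; NA when y never reaches that value.
first_match_lag <- function(x, y) {
  vapply(seq_along(x), function(i) {
    hits <- which(y[i:length(y)] == x[i])  # 1-based offsets within the tail of y
    if (length(hits) == 0L) NA_integer_ else hits[1L] - 1L
  }, integer(1))
}

dt[, lag := first_match_lag(temp1, temp2)]
dt$lag
# 0 1 1 1 1 1 2 2 NA NA
```

The search is quadratic in the number of rows, which is fine for short series but may be slow for long ones.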
Note: The ccf function may also be useful here.
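As a rough illustration of how ccf could be used, the sketch below picks the lag with the highest cross-correlation. Note the assumptions: this yields a single overall lag rather than one per row, the variable name `best_lag` and the choice `lag.max = 5` are illustrative only, and because both series are dominated by the same downward trend the estimate may not be very informative without detrending first.

```r
temp1 <- seq(30, 21, by = -1)
temp2 <- c(30, seq(30, 25, by = -1), seq(25, 23, by = -1))

# Cross-correlation at lags -5..5; plot = FALSE returns the values without plotting.
# For ccf(x, y), lag k relates x[t + k] to y[t], so with x = temp2 a positive k
# would mean temp2 trails temp1 by k rows.
cc <- ccf(temp2, temp1, lag.max = 5, plot = FALSE)

# Lag (in rows, here hours) at which the cross-correlation is largest
best_lag <- cc$lag[which.max(cc$acf)]
best_lag
```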