Creating a bunch of lag variables in a data.table at once

Time: 2014-05-02 03:36:28

Tags: r data.table lag

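I'm trying to create a bunch of lag variables in a data.table at once. I want the lagged values computed within each station and landcover group, and I'm running into some difficulty. Here is my example data.table.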

require(data.table)
    r <- structure(list(station = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
    landcover = structure(c(2L, 2L, 2L, 4L, 4L, 4L, 1L, 1L, 1L, 
    3L, 3L, 3L), .Label = c("foam", "Mixed Forest", "other2", 
    "Sand"), class = "factor"), cv = c(0.273287412020818, 0.453346217936644, 
    0.235088531585817, 0.703112865400233, 0.221907230708271, 
    0.278459655651048, 0.376646346809308, 0.662970017835398, 
    0.296458678818467, 0.390335320625924, 0.712476246695341, 
    0.535612484651002)), .Names = c("station", "landcover", "cv"
), row.names = c(NA, -12L), class = c("data.table", "data.frame"
))

#     station    landcover        cv
#  1:       A Mixed Forest 0.2732874
#  2:       A Mixed Forest 0.4533462
#  3:       A Mixed Forest 0.2350885
#  4:       A         Sand 0.7031129
#  5:       A         Sand 0.2219072
#  6:       A         Sand 0.2784597
#  7:       B         foam 0.3766463
#  8:       B         foam 0.6629700
#  9:       B         foam 0.2964587
# 10:       B       other2 0.3903353
# 11:       B       other2 0.7124762
# 12:       B       other2 0.5356125

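I want to create a bunch of lag variables. For now I'm not even worried about the NA values this produces. How can I create a data.table like the one below without writing this much code? I need the result to remain a data.table.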

r[, cv.lag1 :=  c(rep(NA,1), head(cv, -1)),by=c("station","landcover")]
r[, cv.lag2 :=  c(rep(NA,2), head(cv, -2)),by=c("station","landcover")]
r[, cv.lag3 :=  c(rep(NA,3), head(cv, -3)),by=c("station","landcover")]
r[, cv.lag4 :=  c(rep(NA,4), head(cv, -4)),by=c("station","landcover")]
r[, cv.lag5 :=  c(rep(NA,5), head(cv, -5)),by=c("station","landcover")]
r[, cv.lag6 :=  c(rep(NA,6), head(cv, -6)),by=c("station","landcover")]
r[, cv.lag7 :=  c(rep(NA,7), head(cv, -7)),by=c("station","landcover")]
r[, cv.lag8 :=  c(rep(NA,8), head(cv, -8)),by=c("station","landcover")]
r[, cv.lag9 :=  c(rep(NA,9), head(cv, -9)),by=c("station","landcover")]
r[, cv.lag10 := c(rep(NA,10), head(cv, -10)),by=c("station","landcover")]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA
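
As a side check, the per-lag expression used above can be tried on a plain vector. The sketch below is an illustration only: lag_orig and lag_vec are hypothetical helper names, not part of data.table. It shows that c(rep(NA, i), head(x, -i)) returns i values once i exceeds length(x), and one possible way to trim the result back to the input length.

```r
# The lag expression from the question, wrapped in a function (hypothetical name)
lag_orig <- function(x, i) c(rep(NA, i), head(x, -i))

# Hypothetical variant that always returns exactly length(x) values
lag_vec <- function(x, i) head(c(rep(NA, i), x), length(x))

x <- c(10, 20, 30)

length(lag_orig(x, 2))  # same length as x
length(lag_orig(x, 5))  # longer than x, because i > length(x)

lag_vec(x, 2)           # two NAs, then the first value of x
lag_vec(x, 5)           # all NA, still length(x) values
```

With groups of three rows and lags up to 10, the length mismatch in the first form is worth keeping in mind, since a grouped assignment generally expects one value per row of the group.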

1 answer:

Answer 0 (score: 3)

Thanks to Arun for providing the answer as an elegant one-liner.

r[, c(paste("cv.lag", 1:10, sep="")) := lapply(1:10, function(i) c(rep(NA, i), head(cv, -i))), by=list(station,landcover)]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA
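
For comparison, here is a minimal sketch of the same idea using shift(), which is not part of the answer above and assumes a data.table release that provides it (1.9.6 or later, as far as I know). The names r2, lags and cols are made up for this illustration, and the cv values are the rounded ones from the printed table rather than the full-precision originals.

```r
library(data.table)

# Rebuild the example with the data.table() constructor (rounded cv values)
r2 <- data.table(
  station   = rep(c("A", "B"), each = 6),
  landcover = rep(c("Mixed Forest", "Sand", "foam", "other2"), each = 3),
  cv        = c(0.2732874, 0.4533462, 0.2350885, 0.7031129, 0.2219072,
                0.2784597, 0.3766463, 0.6629700, 0.2964587, 0.3903353,
                0.7124762, 0.5356125)
)

lags <- 1:10                       # hypothetical: which lags to create
cols <- paste0("cv.lag", lags)     # cv.lag1 ... cv.lag10

# shift() returns one lagged vector per element of n, padded with NA,
# so the list lines up with the column names in cols
r2[, (cols) := shift(cv, n = lags, type = "lag"), by = .(station, landcover)]

r2[, .(station, landcover, cv, cv.lag1, cv.lag2)]
```

Because shift() pads with NA and always returns one value per row of the group, it avoids the case where a lag larger than the group size makes c(rep(NA, i), head(cv, -i)) longer than the group.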