Need help with lead/lag in data.table

Time: 2015-08-28 21:43:58

Tags: r data.table dplyr

I'm struggling with a very simple lead/lag and hope you can help. Suppose I have:

library(data.table)
set.seed(1234)
foo2 <- data.table(data.frame(Wk = rep(c(1:5), 4), CODE = c(rep('a',5),rep('b',5),rep('c',5),rep('d',5)), METRIC = rnorm(20)))[order(-Wk,CODE)]
currentWeek <- max(foo2[["Wk"]])

foo2

    Wk CODE      METRIC
 1:  5    a  0.85023226
 2:  5    b -1.19452788
 3:  5    c -0.49558344
 4:  5    d  2.12111711
 5:  4    a -0.17378717
 6:  4    b -0.19159377
 7:  4    c  1.00151325
 8:  4    d  0.97291675
 9:  3    a -1.37230189
10:  3    b -0.40273198
11:  3    c  1.70596401
12:  3    d  0.87820363
13:  2    a -0.16999408
14:  2    b  0.54999735
15:  2    c  0.25519600
16:  2    d -1.13460804
17:  1    a -0.17778996
18:  1    b  0.69760871
19:  1    c -0.05315882
20:  1    d  0.35555030

How can I transform the above into:

    Wk CODE      METRIC METRIC_currentWeekMinus1  METRIC_currentWeekMinus2
 1:  5    a  0.85023226 -0.17378717               -1.37230189
 2:  5    b -1.19452788 -0.19159377               -0.40273198
 3:  5    c -0.49558344  1.00151325                1.70596401
 4:  5    d  2.12111711  0.97291675                0.87820363
 5:  4    a -0.17378717  .                         .
 6:  4    b -0.19159377  .                         .
 7:  4    c  1.00151325  .                         .
 8:  4    d  0.97291675  .                         .
 9:  3    a -1.37230189  .                         .
10:  3    b -0.40273198  .                         .
11:  3    c  1.70596401  .                         .
12:  3    d  0.87820363  .                         .
13:  2    a -0.16999408  .                         NA
14:  2    b  0.54999735  .                         NA
15:  2    c  0.25519600  .                         NA
16:  2    d -1.13460804  .                         NA
17:  1    a -0.17778996  NA                        NA
18:  1    b  0.69760871  NA                        NA
19:  1    c -0.05315882  NA                        NA
20:  1    d  0.35555030  NA                        NA

Any ideas? Thanks in advance!

1 answer:

Answer 0: (score: 3)

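Using dplyr (the rows are already sorted by descending Wk, so within each CODE, lead() returns the values from the preceding weeks):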

library(dplyr)
foo2 %>% group_by(CODE) %>% mutate(last1 = lead(METRIC), last2 = lead(METRIC, 2))

With data.table, version v1.9.5 (installed from GitHub):

foo2[ , c("last1", "last2") := shift(METRIC, 1:2, type = "lead"), by = CODE]
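Both snippets above rely on foo2 already being sorted by descending Wk. As a minimal sketch, assuming a data.table version that provides shift(), the same result can be produced without depending on the incoming row order by sorting by Wk within CODE first and using type = "lag". The names dt and lagged are hypothetical, and the new column names are taken from the desired output in the question.

```r
library(data.table)

# Hypothetical toy data with the same shape as foo2 in the question
set.seed(1234)
dt <- data.table(
  Wk     = rep(1:5, times = 4),
  CODE   = rep(c("a", "b", "c", "d"), each = 5),
  METRIC = rnorm(20)
)

# Sort by week within each CODE so that "lag" always means "earlier week",
# regardless of how the rows were ordered beforehand
lagged <- dt[order(CODE, Wk)]

# shift() with n = 1:2 returns a list of two vectors, one per new column
lagged[, c("METRIC_currentWeekMinus1", "METRIC_currentWeekMinus2") :=
         shift(METRIC, n = 1:2, type = "lag"), by = CODE]

# Restore the display order used in the question (latest week first)
setorder(lagged, -Wk, CODE)
print(lagged)
```

Week 1 has no earlier week, so both new columns are NA there, and week 2 is NA only in the second column, which matches the pattern sketched in the question.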